notes/list_and_usability.txt


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147


Lists and usability
===================

There are two list/array types:

    []T
    list<T>

Can these two be reduced to one?
- Yes, if list<T> can be made as good as []T is,
  or vice versa.

Requirements:
1. lists with static size should be able to work as a value type
2. lists should be able to be fully constant without (absolute) pointers
3. it should be possible to specify a fixed length
4. an automatic conversion is needed (and it should always work if
   the conversion is sensible).

Syntax to use?
    list<T>         []      []
    list<T,l>       [l]T    [l]
    list<list<T>>   [,]T    [][]T

How to indicate that the list itself, the elements, or both, should be
modifiable?
* Most simple solution: "var []T", "[]var T", "var []var T"
* Better solution:      "[var]T",  "[]var T", "[var]var T"
* More aesthetic sol.:  "[var]T",  "var []T", "var [var]T" (best?)
  ...BUT for reference types there is also the problem of whether
  the object itself or the reference should be modifiable. So there
  are actually three(/four) things that can be modifable:
  - The array length
  - The array "slots" (and this can be modifiable for constant-sized arrays)
  - The array "element contents"
  (- The reference to the array, but this can use "var ref ..." syntax)
* Solution with parentheses that supports all 4 cases:
    [var]T      <-- variable length
    (var [])T   <-- variable array "slots"
    []var T     <-- variable array "element contents"
    var []T     <-- variable reference to array
* Alternative solution: Swap order of types to right-to-left:
    T[var]      <-- variable length
    T var[]     <-- variable array "slots"
    var T[]     <-- variable array "element contents"
    T[] var     <-- variable reference to array
    - This also "simplifies" reference syntax
      (but maybe at expense intuitivity?):
        - var T x   <-- modifiable T
        - T var x   <-- modifiable reference
        - T ref x   <-- immutable with explicit reference (needed inside structs)
        - T inplace x <-- immutable with explicit non-reference (needed inside structs)
* For non-reference types, there is no difference between "slots" and "element contents"!
  (it might be more logical to consider them to be "slots")

How to define functions for [] lists?
* Solution 1:
  - Translate to list<T> internally, after checking the length.
    This way [] lists can be extended with new functions.
* Solution 2:
  - Have a special syntax for defining functions in list types.
    Reserve T as a special name?
        func []T.length() -> usize
        func var []T.add(T elem)
    Or use a new type param syntax:
        func<ET> []ET.length() -> usize
        func<ET> var []ET.add(ET elem)
  - Possible future extension:
    This could also work with specific list types (but it would complicate
    the compiler):
        func [4]byte.to_uint32() -> int<0..2^32-1>
        func [16]T.contains(T elem) -> bool
        func []int.largest() -> int

How to handle parameters with []ET and [l]ET?
- Always pass parameters as if they were dynamically-sized lists
  (and possibly non-contiguously allocated in memory)?
- Structs can contain either (ref vs. inline).
  Inline allocation is only allowed for lists with a compile-time known size.

How about multi-dimensional arrays?
- Transforming every sub-array in a large array between dynamic size and
  compile-time constant size is going to be slow.
- The best solution is most likely to just require an exact match
  of type/length.
- Jagged arrays with a single allocation (or relative pointers) is required
  for allocation of multi-dimensional array constants.
    - A general solution that supports elements of either string or array
      type is desirable (and perhaps also map types?)

Syntax for in-place-allocated objects in arrays?
- With left-to-right syntax for types:
    []inplace T
    []T inplace       <-- needs lookahead!
    []Th<T> inplace   <-- it gets more complicated with generic types
- With right-to-left syntax for types:
    inplace T[]
    T inplace[]

How to optimize []byte? / Should "[]inline SomeType" be allowed?

Copy constructors:
    []T b = .copy(arena, a)
    []T b = .reuse(arena, a)

    func<var ET> .copy(arena, ET e) -> var []ET
    func<aliased ET> .reuse(arena, ET e) -> var []ET
        lifetime e >= return

    - This syntax for functions with generic types would also simplify parsing
    - Aliasing in shallow copies. How to handle it?
    - How to copy the elements???
        Solution 1: Have "typeinfo" qualifiers for type parameters?
            func<typeinfo ET> .copy(arena, ET e) -> []ET
            func<known ET> .copy(arena, ET e) -> []ET
            - What to name the keyword? "typeinfo"? "known"?
            - what about nested copying? We need to know the typeinfo of
              the nested types also!
            - Maybe the structure should be as follows:
                type TypeInfo<T> = struct {
                    func<known T> copy(arena, T x) -> T
                }


List reference/Data structure layout:
- The data structure needs to be able to handle byte arrays
- bit arrays (bool[] or Some2ValueEnum[]) would also be nice to have.
- It would be great if string can be implemented as an alias for []byte
  (but not at the syntactic/semantic level, because special care has to
   be taken for multi-byte characters)

How about lists with a generic element type?
    func f<T>(T[] list, T elem) {
        if not list.is_empty() {
            T otherelem = list.first()
            # ...
        }
    }
    - Which types can elem and otherelem be of?
       - objects?   (yes)       <- size of pointer
       - records?   (yes?)      <- size of record OR size of pointer (lifetimes need to be handled)
       - bytes?     (yes)       <- 1 byte
       - booleans?  (yes?)      <- 1 bit. in word-sized chunks or in byte chunks?
       - functions? (no?)       <- size of code pointer
    - Those have different sizes
    - The comparison needs to be done differently.
      Records need typeinfo or a comparion function to compare.