1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
|
Unified lists, arrays (and iterators/pipes?)
============================================
Problems with lists (vs arrays):
1. Can't be stored directly in strutcs/arrays (need a reference).
2. Types smaller than a pointer use extra space
(i.e. byte, int16, sometimes int32, and on CHERI platforms even int64).
3. The length cannot be specified in the type system
Solution to 1 & 2: Serialization
--------------------------------
Fixed-size lists in structs could be stored in serialized form.
For example, for this struct:
struct {
uint16 info
List<byte,3> array
uint16 trailer
}
It could be stored like this in memory:
(on a 32-bit platform, but the technique works on 64-bit as well):
uint16 info
uint16 alignment_padding1
uint32 serialization_marker_and_type (type = immutable byte array)
uint32 length (small lengths could be merged into the marker)
byte value1
byte value2
byte value3
byte alignment_padding2
uint16 trailer
Note: The serialization marker could include a mutability bit to allow
for modification (but not size change) or serialized lists.
Advantages:
* Can be used where a reference to a list is expected
(no copying needed).
* Can be used in .rodata
* Can be used for mutable but fixed-size lists.
Disadvantages:
* Can obviously only support fixed-size lists.
* Uses up to two additional words + alignment padding before and after.
(in the example, it is 8+2 = 10 additional bytes for a 3 byte array,
on a 64-bit system it would have been 16+7 = 25 additional bytes.
I.e. 4.3x vs 8.3x memory usage, but this is close to the worst case)
* The serialization format of immutable-List<byte> is different from String.
Note that String cannot have a fully compatible format, because it is
not fixed size and it can additionally change size on element mutation
(non-UTF-8 or NUL bytes in the beginning require a header with length etc).
Solution to 3
-------------
Allow a type parameter to be:
* Type only <T>
* Value only <v>
* Value+Type <v T>
* ?Value+Type <v T> or <T>
Usage syntax:
Cache<byte>
BitArray<3>
List<byte>
List<3 byte>
List<len byte>
Definition syntax with support for usize only:
(should booleans, enums and signed integers be allowed?)
type Cache<T>
type BitArray<len>
type List<?len T>
Problem: Confusing when some types can be used without "ref"
------------------------------------------------------------
It could be confusing that List<n T> can be placed directly in a
struct/array, but List<T> or "OtherType" or "PrivateType" can't.
Solutions:
* Automatically add "ref" if needed?
* Support serialization on user-defined types?
* Make List<T> a fully built-in type and call it
(along with string) by a lower-case name?
|