Base profile and extensions
===========================

SLUL "profiles":

* Minimal (used for bootstrapping)
    - no signed integers
    - only 32-bit integers
    - no negation operator
    - no float
    - no extensions
* Base
* Sloppy
    - semi-dynamic typing
    - compat syntax for function decls:  `f a b -> x y`

Semi-dynamic typing
-------------------

Types are inferred from usages/operations:

* which methods are called
* which operators are used
* can it be `none`?
* for integers: what is the maximum range
* can it be modified or not?
* can it be aliased?

Note that this should not be allowed in library interfaces!

When types cannot be inferred:

* unknown-sized unsigned integers are 64-bit unsigned
* unknown-sized possibly-signed integers are 65-bit signed
  (wide enough to hold both the signed and the unsigned 64-bit range)
    - this might be represented as a variant
* references are variants
* runtime checking is performed when going back to static
  types or known-but-dynamic types:
    - nullability is checked
    - mutability is checked
    - aliasing is checked
    - range is checked
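
As an illustration of the checks above, here is a minimal sketch in C of
going from a dynamic value back to a static 32-bit signed integer. The
names `DynKind`, `DynValue` and `dyn_to_int32` are hypothetical and are
not part of any SLUL runtime. Only the nullability and range checks are
shown; mutability and aliasing checks would need reference kinds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down variant with only the fields this sketch needs. */
enum DynKind { DYN_NONE, DYN_UINT, DYN_INT };

struct DynValue {
    enum DynKind kind;
    union {
        uint64_t u64;   /* valid when kind == DYN_UINT */
        int64_t s64;    /* valid when kind == DYN_INT */
    } u;
};

/* Converts a dynamic value to a static 32-bit signed integer.
   Returns false if the nullability check or the range check fails. */
bool dyn_to_int32(const struct DynValue *v, int32_t *out)
{
    switch (v->kind) {
    case DYN_UINT:
        if (v->u.u64 > (uint64_t)INT32_MAX) return false;   /* range */
        *out = (int32_t)v->u.u64;
        return true;
    case DYN_INT:
        if (v->u.s64 < INT32_MIN || v->u.s64 > INT32_MAX) return false;
        *out = (int32_t)v->u.s64;
        return true;
    default:
        return false;   /* `none` is not a valid integer */
    }
}
```

A real implementation would presumably report which check failed rather
than returning a plain boolean.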

Note that, *if possible to implement*, mixing of types should not be allowed.
This is necessary to make it easy to go from dynamic to static typing.
E.g. the following should not be allowed:

    if a <> false    # assuming boolean
        int i = a    # assuming int this time, but it's already bool
    end
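
One way a compiler could reject such mixing is to let the first usage of
a variable fix its inferred kind, and to report an error on any later
usage that disagrees. The following C sketch assumes hypothetical names
(`InferredKind`, `VarInfo`, `infer_usage`) and ignores range, nullability
and the other inferred properties.

```c
#include <stdbool.h>

/* Hypothetical inferred kinds. INFER_UNKNOWN means "no usage seen yet". */
enum InferredKind { INFER_UNKNOWN, INFER_BOOL, INFER_INT, INFER_REF };

struct VarInfo {
    enum InferredKind kind;
};

/* Records that a variable is used as `usage`.
   Returns false if this conflicts with an earlier usage. */
bool infer_usage(struct VarInfo *var, enum InferredKind usage)
{
    if (var->kind == INFER_UNKNOWN) {
        var->kind = usage;      /* the first usage decides the kind */
        return true;
    }
    return var->kind == usage;  /* later usages must agree */
}
```

For the example above, the comparison with `false` would record
`INFER_BOOL` for `a`, and the later assignment to an `int` would make
`infer_usage` return false.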

Possible layout of the variant type:

    enum SlulVariantKind {
        SLUL_V_NONE,
        SLUL_V_BOOL,
        SLUL_V_INT,
        SLUL_V_UINT,
        SLUL_V_BYTE,
        SLUL_V_SBYTE,
        ...
        SLUL_V_LONG,
        SLUL_V_ULONG,
        /* This should be a bitmask.
           TODO: How should the type info be stored? */
        SLUL_V_REF,
        SLUL_V_REF_MODIF,
        SLUL_V_REF_ALIASED,
        SLUL_V_REF_VOLATILE,
        /* Solution 1 for multi-threading handling:
           This could be used to prevent half-updated variants from being
           seen. It could be set during changes (via compare-and-set),
           before setting the union value. And finally the real `kind`
           could be set.
           (But wouldn't this be affected by the ABA problem?) */
        SLUL_V_LOCKED_FOR_UPDATE
    };

    struct SlulVariant {
        enum SlulVariantKind kind;
        /* Solution 2 for multi-threading handling:
           Optimistic "locking" with retries with an `update_number`. */
        unsigned update_number;
        union {
            bool boolval;
            unsigned char u8;
            signed char s8;
            ...
            uint64_t u64;
            int64_t s64;
            void *ptr;
        } u;
    };

    /* Solution 3 for multi-threading handling:
       Store the kind and the pointer in the same place, such that
       reads from the pointer will either give a real, valid, pointer OR
       it will give a low enough (page 0) pointer that cannot ever be
       a valid address.

       Access in multithreaded contexts just means that the value has to be
       validated (valid pointer or valid integer range) before it can be
       used. One can still get the wrong value (e.g. alternating between
       integer 0 and bool true can result in an incorrect bool false or
       integer 1). */
    struct SlulVariant {
        union {
            enum SlulVariantKind kind;  /* for non-reference-types */
            void *ptr;                  /* for reference types */
        } u1;
        union {
            /* for reference types.
               XXX there's a race condition here: this "subtype" field
               can be wrong! */
            enum SlulRefKind kind;
            /* the ones below are for non-reference types */
            bool boolval;
            unsigned char u8;
            signed char s8;
            ...
            uint64_t u64;
            int64_t s64;
        } u2;
    };

    /* Solution 4: Simply don't allow changes of the type. This should be
       fine, but programmers who are used to e.g. JavaScript or Python
       might be confused (although changing the type of a variable is
       uncommon even in those languages).

       This is a very simple solution. Maybe it's the best one? */
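
The optimistic scheme of Solution 2 resembles a sequence lock. Below is a
minimal C11 sketch under several assumptions that are not in the layout
above: the payload is reduced to a single 64-bit word, all fields are
atomics, there is only one writer at a time, and the names `SeqVariant`,
`seqvariant_write` and `seqvariant_read` are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical simplified variant. All fields are atomics so that
   racing reads and writes have defined behaviour in C11. */
struct SeqVariant {
    atomic_uint update_number;   /* odd while an update is in progress */
    atomic_int kind;
    _Atomic uint64_t payload;
};

/* Writer side. Assumes that there is only a single writer at a time. */
void seqvariant_write(struct SeqVariant *v, int kind, uint64_t payload)
{
    unsigned n = atomic_load_explicit(&v->update_number,
                                      memory_order_relaxed);
    /* Mark the update as in progress (odd number). */
    atomic_store_explicit(&v->update_number, n + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&v->kind, kind, memory_order_relaxed);
    atomic_store_explicit(&v->payload, payload, memory_order_relaxed);
    /* Publish the update (even number again). */
    atomic_store_explicit(&v->update_number, n + 2, memory_order_release);
}

/* Reader side: retries until it sees the same even update_number
   before and after reading the kind and the payload. */
void seqvariant_read(struct SeqVariant *v, int *kind, uint64_t *payload)
{
    unsigned before, after;
    do {
        before = atomic_load_explicit(&v->update_number,
                                      memory_order_acquire);
        *kind = atomic_load_explicit(&v->kind, memory_order_relaxed);
        *payload = atomic_load_explicit(&v->payload, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&v->update_number,
                                     memory_order_relaxed);
    } while (before != after || (before & 1u) != 0);
}
```

Readers never block the writer, but they may spin while an update is in
progress. Supporting several concurrent writers would need something
more, for example a compare-and-set on `update_number`.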

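The validation step of Solution 3 could look roughly like the C sketch
below. It assumes that no valid object lives in the first page and that
all kind values are smaller than the page size. The limit of 4096, the
numeric kind values and the function names are placeholders chosen for
this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: addresses below this limit are never valid pointers,
   and every kind value is smaller than it. */
#define KIND_LIMIT ((uintptr_t)4096)

/* Returns true if the first word of the variant holds a kind (that is,
   a non-reference value), and false if it holds a pointer. */
bool word_is_kind(uintptr_t word)
{
    return word < KIND_LIMIT;
}

/* Validates a (kind, value) pair that may have been read while another
   thread was updating it. Hypothetical kinds: 1 = bool, 2 = 32-bit int. */
bool value_is_valid(uintptr_t kind, int64_t value)
{
    switch (kind) {
    case 1:  return value == 0 || value == 1;
    case 2:  return value >= INT32_MIN && value <= INT32_MAX;
    default: return false;
    }
}
```

As noted in the description of Solution 3, this only rejects values that
are out of range for their kind. A torn read that happens to be in range
still passes validation.
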
For interfacing with dynamic code from static code, it can be useful to
have an additional static type for `uint65`.
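
A 65-bit value could, for instance, be stored as a sign bit plus the low
64 bits of the two's complement representation. The C sketch below makes
that assumption. The struct name `Int65` and the conversion functions are
hypothetical, and no arithmetic is implemented, only range-checked
conversions to and from the 64-bit static types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical representation of a 65-bit two's complement integer:
   `low` holds the low 64 bits and `negative` is the sign bit (bit 64). */
struct Int65 {
    bool negative;
    uint64_t low;
};

struct Int65 int65_from_u64(uint64_t x)
{
    struct Int65 r = { false, x };
    return r;
}

struct Int65 int65_from_s64(int64_t x)
{
    /* Conversion to uint64_t is well-defined (modulo 2^64) and yields
       the low 64 bits of the two's complement representation. */
    struct Int65 r = { x < 0, (uint64_t)x };
    return r;
}

/* Range-checked conversion to unsigned 64-bit. */
bool int65_to_u64(struct Int65 v, uint64_t *out)
{
    if (v.negative) return false;
    *out = v.low;
    return true;
}

/* Range-checked conversion to signed 64-bit. */
bool int65_to_s64(struct Int65 v, int64_t *out)
{
    if (v.negative) {
        /* Representable only if >= -2^63, i.e. bit 63 of `low` is set. */
        if (v.low < ((uint64_t)1 << 63)) return false;
        /* The value is low - 2^64 = -(~low) - 1, which cannot overflow
           because ~low <= INT64_MAX here. */
        *out = -(int64_t)(~v.low) - 1;
    } else {
        if (v.low > (uint64_t)INT64_MAX) return false;
        *out = (int64_t)v.low;
    }
    return true;
}
```

With this layout, static code can accept any dynamic integer without loss
and perform the range check only when narrowing to a 64-bit type.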