Notes: Usability, references, numeric types / comparisons, etc.

author: Samuel Lidén Borell <samuel@kodafritt.se> 2024-06-02 21:20:48 +0200
committer: Samuel Lidén Borell <samuel@kodafritt.se> 2024-06-02 21:20:48 +0200
commit: 580bf6130632f6855fddeea7b07c8401c56108f2 (patch)
tree: 4bd5e7cdb68408c52ad8df030f7f887c7d97def0 /notes/slul2.txt
parent: db73835b12f41be8766384a1cdcc34a0848354dc (diff)
1 files changed, 340 insertions, 0 deletions
diff --git a/notes/slul2.txt b/notes/slul2.txt
new file mode 100644
index 0000000..9d103d0
--- /dev/null
+++ b/notes/slul2.txt
@@ -0,0 +1,340 @@
+SLUL2
+=====
+
+Making SLUL:
+* easier to use
+* easier to implement
+* more future-proof / more portable
+
+Desirable changes:
+
+* Revise ref
+    - forbidding refs to non-struct/non-array types might enable some optimizations
+    - *removing* explicit refs would definitely have impacts on usability.
+      not sure if good or bad. it makes the proglang more implicit/"magic",
+      which can be a bad thing.
+* Implicit arenas?
+    - and for mutating methods, use the 4 "allocation variants": (inspired by Vale)
+        1. placement new        (uses arena only for "indirect"/references fields)
+        2. arena new            (allocate in given arena)
+        3. modify self          (in-place modification. Can keep referenced data)
+        4. discarding self-mod  (in-place modification. Cannot keep referenced data)
+        some of these are applicable to constructors also
+* Garbage-collected arenas?
+    - I.e. local GC
+    - It would remove "anxiety" around memory allocation
+    - Downside 1: It requires meta-data, which isn't needed with plain
+      arena allocation.
+    - Downsides N: The usual downsides with GC
+* Revise expr integer types
+  What are the use-cases for "non-plain ints"?
+    - length type / ssize/usize
+    - byte/int16 arrays
+    - small/bitsliced fields in structs
+    - fixed-size wrapping uints (e.g. for hash functions).
+* Move some stuff from hard-coded syntax to code? e.g. like Scheme, REBOL, Nim.
+    - might actually work with statements:
+        - they can only appear inside function bodies
+        - so the toplevels are available, and their types are known.
+    - But it would definitely need inlining to work with reasonable speed.
+    - Perhaps a bad idea after all?
+* Misc syntax stuff:
+    - Use tabs instead of spaces?
+      But this gets tricky with alignment of e.g. parameters.
+
+
+Control statements defined in library module headers
+----------------------------------------------------
+
+These need some kind of analysis of the IR to check varstates (liveness, etc).
+It also needs to handle nested if-elseif-elseif...
+
+So perhaps this is a bad idea?
+
+Example:
+
+    statement "if" Expr cond Statements true_block "else" Statements false_block
+    {
+        cond
+        CONDJUMP FALSE false_block
+        true_block
+        JUMP end
+        false_block
+        end
+    }
+    statement
+
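The jump template above can be modelled outside the compiler to see how nested if-elseif chains would expand. The following is only a sketch in Python, not SLUL: the tuple-based instruction format, the `LABEL` pseudo-instruction, and the names `lower_if` and `_labels` are assumptions made for illustration. It does not attempt the varstate (liveness) analysis mentioned above.

```python
from itertools import count

# Hypothetical label generator so that nested expansions get unique labels.
_labels = count()

def lower_if(cond, true_block, false_block):
    """Expand the "if ... else" template into a flat instruction list.

    All three arguments are lists of already-lowered instructions.
    """
    n = next(_labels)
    false_label = f"false_{n}"
    end_label = f"end_{n}"
    return [
        *cond,                               # evaluate the condition
        ("CONDJUMP_FALSE", false_label),     # skip true_block if it is false
        *true_block,
        ("JUMP", end_label),                 # skip false_block
        ("LABEL", false_label),
        *false_block,
        ("LABEL", end_label),
    ]

# An "else if" is just another expansion used as the false block.
inner = lower_if([("EVAL", "b")], [("CALL", "g")], [("CALL", "h")])
outer = lower_if([("EVAL", "a")], [("CALL", "f")], inner)
```

Because each expansion draws fresh labels, nesting needs no special case in this model. Checking variable states across the generated jumps would still need a separate pass.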
+Super-simple proglang
+---------------------
+
+Only two/three kinds of typedefs. Not allowed as anonymous types (or? it is useful for e.g. return values)
+
+    record SomeStruct {
+        int field1
+        OtherType field2    # <--- compiler chooses whether to put in ref or not
+                            # this makes FFI trickier. But non-closed types are always refs.
+                            # this also makes lifetimes and aliasing trickier.
+                            # maybe "var" should not be allowed to alias?
+    }
+
+    enum SomeEnum {
+        ...
+    }
+
+    # Maybe some kind of sum/union/variant type
+    record ExprNode {
+        ExprType type
+        int line
+        int column
+        switch type {
+        case .unary
+        case .binary
+            Expr operand_a
+            if type == .binary {
+                Expr operand_b
+            }
+        case .call
+            Expr func_expr
+            int num_args
+            # have a built-in list type?
+            # and choose the best possible representation?
+            # (in this case it's runtime-determined frozen-length, so it could be a pointer to an array. or a full-blown list type)
+            int[num_args] args
+        }
+    }
+
+    # Maybe some kind of constraints
+    func process_op(ExprNode<.type in (.unary, .binary)> expr)
+    func process_op(ExprNode expr [.unary, .binary])
+    func process_op(ExprNode(.unary .binary) expr)
+    func process_op(ExprNode<.unary .binary> expr)
+    func ExprNode<.unary .binary>.process_op()
+    func ExprNode.process_op()
+        for (.unary, .binary)
+    func ExprNode.process_op()
+        with (.unary, .binary)
+    func ExprNode.process_op()
+        this in (.unary, .binary)
+    func ExprNode.process_op()
+        given type == .unary or type == .binary
+
+Qualifiers for records and enums:
+
+    record Point closed {  # (require a newline here?)
+        int x
+        int y
+        # no more fields can be added. allows some optimizations, such as call-by-value / embedding into structs
+    }
+
+    enum SubPixel closed {
+        .red
+        .green
+        .blue
+    }
+
+Enums can have a base type and/or integral values also
+(this is mainly useful for FFI)
+
+    enum StatusByte closed byte {
+        .ready = 10
+        .running = 20
+        .failure = 90
+    }
+
+Integer / elementary types:
+
+* Perhaps even use variable-size integers?
+  The downside is that += 1 etc. might require allocation.
+
+Methods:
+
+* Skip "this". But disallow shadowing.
+
+Type identifiers
+
+* For consistency, always include the "." in typeidentifiers, even in
+  e.g. enum definitions.
+* Constructors are maybe not that intuitive (can they be improved?):
+
+    func .new(int a, int b) -> Thing
+
+
+Avoiding punctuation:
+
+* Can the . in typeidentifiers be skipped?
+* Can the () in function calls be skipped?
+    - if the function call fits on one line
+    - (unless a comma is required between them) and the parameters are terms
+    - and the function call is not nested inside
+      a function call, field or index expression.
+    - related: tuples. but that would be ambiguous if used as function arguments
+* Can the () in function declarations be skipped?
+
+    func example
+        int a
+        int b
+        return bool
+    {
+        if a == b {
+            otherfunc a, 123
+            return true
+        }
+        
+    }
+
+Can refs be avoided?
+
+    # objects:
+    # These are always passed by reference.
+    # References can be compared with "ref_is" or "is" or a similar operator.
+    # The "==" and "!=" operators are not allowed (maybe it should be allowed to implement them? e.g. with a method called "equals"?)
+    type Box = object {
+        # These are references:
+        Item a
+        Item b
+        # Perhaps allow syntax like this:
+        Item a1, b1
+        Item a1, Item b1
+        # Regarding tuples:
+        # I think that maybe they CAN be references if they are too large to use values :)
+        # - We can require that if the object is mutable, it must also be passed by arena-ref.
+        # - Tuples up to some certain size could be embedded / passed by value
+        #   (Check the optimal limit. It's at least the size of two pointers, but it could be larger)
+        # - Tuples allocated in the *same* arena can just be referenced directly!
+        #   (this should be fairly simple and fast to check).
+        #   - If each thread uses a contiguous virtual-memory block,
+        #     then this would be a trivial range check.
+        # - Tuples allocated by the same thread and in SLUL code, can
+        #   (as an optional optimization) be referenced if
+        #   1) the lifetime allows it (how to check this at runtime?), or
+        #   2) the runtime uses garbage collection, and can perform GC in
+        #   this case.
+        # - Tuples allocated in SLUL code from other threads may or may not
+        #   be possible to reference depending on whether the runtime
+        #   supports cross-thread GC. For consistency across implementations,
+        #   it might be better to just re-allocate/copy in this case.
+        # - Other tuples would require a copy. (This is really a requirement
+        #   for tuples allocated from C code, unless it uses SLUL's arena
+        #   allocation functions in slulrt.)
+        #   
+        LargeValue large
+    }
+    # opaque objects:
+    # - Like objects, but fields (and layout/size) are inaccessible
+    # - Lacks {} and has the layout defined in the impl, just like a function can have its body in the impl
+    # - Perhaps it should be forbidden to have non-opaque objects in interfaces? It's generally an anti-pattern.
+    type Item = object
+    # tuples:
+    # - The ABI decides when to pass these by ref or value
+    # - Reference comparison operators are not allowed.
+    # - The contents can be compared with the "==" and "!=" operators.
+    # - Tuples can't be opaque/private.
+    type Point = (int x, int y)
+    type Point = (int x, y) # perhaps allow this syntax as well (...and multi-line syntax without comma also)
+    type LargeValue = ([10000]byte buffer)
+    # For type-scoped functions that return an object ("constructors"):
+    # - They implicitly take an arena parameter
+    # - The returned reference is an arena reference
+    func .new() -> Box
+    constructor Box.new()   # maybe type-scoped functions should have this syntax?
+    # Return values in methods have the same lifetime as the object itself
+    # - Should the this parameter be "var"? 
+    # - Should the this parameter be "arena"? 
+    func Box.get_contents() -> Item
+    # Parameters do not implicitly transfer ownership. Inside the callee, the lifetime of "other" ends when the function returns.
+    # - Should the parameter be "arena"? 
+    func Box.equals(Box other) -> bool
+    # Parameters can be marked with "keep" to allow shared ownership
+    func Box.set_contents(keep Item contents)
+    func Box.set_contents!(keep Item contents)    # perhaps there should be a ! for functions that modify the object?
+    func var Box.set_contents(keep Item contents) # or a qualifier like this.
+    # Parameters can be passed as "var"
+    # - if passed as "keep", we need exclusive access (or the item can be marked as aliased)
+    func Box.squeeze_item(keep var Item item)
+    func Box.squeeze_item(keep aliased var Item item)
+    # To have multiple outputs from a function, use a tuple as the return value:
+    # - The ABI decides when to pass these by ref (implicit parameter) or value
+    # - Because tuples can't be opaque, the return value could be returned by value.
+    # - Because tuples can't be opaque, the *caller* allocates (on stack) if it's not possible to pass by value.
+    func Box.get_both_items() -> (Item a, Item b)
+
+    # How should function references work?
+    # - What keyword to use when there are no refs?
+    # - Most of the time, you want a context-parameter
+    # - For non-ref (or slot) types, you may want a (reference, length) to process multiple items at once.
+    func Box.process_contents(delegate(Item item) handler)
+
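The tuple notes above suggest that a tuple allocated in the same arena can be referenced directly, and that with one contiguous block per arena this becomes a range check. A minimal sketch of that decision follows, in Python with addresses modelled as plain integers. The `Arena` class, its fields, and `store_tuple` are hypothetical names and not part of slulrt.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arena:
    base: int   # first address of the arena's contiguous block (assumed layout)
    size: int   # length of the block in bytes

    def contains(self, addr: int, length: int = 1) -> bool:
        """True if [addr, addr + length) lies entirely inside the block."""
        return self.base <= addr and addr + length <= self.base + self.size

def store_tuple(target: Arena, addr: int, length: int) -> str:
    """Decide whether a tuple can be referenced directly or must be copied."""
    return "reference" if target.contains(addr, length) else "copy"

arena = Arena(base=0x1000, size=0x100)
print(store_tuple(arena, 0x1010, 16))   # inside the block
print(store_tuple(arena, 0x2000, 16))   # allocated elsewhere
```

The check is two comparisons, which is why the notes call it trivial. The other cases (same thread but different arena, other threads, C allocations) would fall through to the copy path in this model.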
+Should the builtin types use TitleCase names also?
+
+    Probably yes, for consistency.
+
+    String
+    Byte
+    Int16
+
+    Java developers might confuse these with reference types, though :(
+    And worse, as a Java developer, you might start using e.g. Byte
+    where it should be byte in your Java code. That will usually silently
+    compile without any warnings, but can be broken (with == != operators)
+    or slow.
+
+    Solution:
+
+    Use different names:
+    - byte -> UInt8 or U8
+    - int  -> Integer (or even skip this type, and have only fixed-sized Int*/UInt*)
+
+    Regarding the extra finger strain to hold shift and stretching out the
+    finger to push the letter button: That could be solved by having the IDE
+    auto-capitalize the type if it exists and a type is expected at the given
+    location.
+
+
+Should/can there be an arbitrary-sized integer type?
+
+    E.g. allow integers -16384..16383 to be stored directly, and use a
+    reference for larger integers.
+
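The range -16384..16383 is what fits in 15 bits as two's complement. As a rough illustration of "store small values directly, use a reference otherwise", here is a Python model. The `("inline", ...)` / `("ref", ...)` cells, the list used as a heap, and the function names are all assumptions for the sketch, not a proposed representation.

```python
SMALL_MIN = -16384   # range taken from the notes
SMALL_MAX = 16383

def encode(value, heap):
    """Return an inline cell for small values, else a reference into heap."""
    if SMALL_MIN <= value <= SMALL_MAX:
        return ("inline", value)
    heap.append(value)               # stands in for an allocation
    return ("ref", len(heap) - 1)

def decode(cell, heap):
    """Recover the integer from either kind of cell."""
    kind, payload = cell
    return payload if kind == "inline" else heap[payload]

def add(a, b, heap):
    """Add two encoded numbers; the result may need an allocation."""
    return encode(decode(a, heap) + decode(b, heap), heap)

heap = []
x = encode(16383, heap)              # still inline, heap stays empty
y = add(x, encode(1, heap), heap)    # crosses the limit, so it allocates
```

This also shows the cost mentioned earlier for variable-size integers: a plain `+= 1` at the boundary turns into an allocation.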
+    What should it be called?
+
+        num
+        int
+        intn
+        integer
+        BigNum
+        BigInt
+        Num
+        Int
+        IntN
+        Integer
+
+    The compiler could optimize it to a more efficient type if the range
+    is known!
+
+        num i = get_number()  # since it is immutable, we can infer the type from the return value of get_number()
+
+    Maybe it should be possible to specify a range? What syntax to use?
+
+        var num<0..=10> i = 0
+        var num<0 upto 10> i = 0
+        var num i [[0 <= value <= 10]] = 0
+        var num<0-10> i = 0     # but "-" is also the minus operator :(
+        var num<0~10> i = 0
+
+Print function:
+
+    How simple can it be, without creating confusion/problems or hard-coding things?
+
+        out.print("number: {}", .[123])     # array constructor
+        out.print("number: {}", 123)        # safe variant-type var-arg
+        out.print "number: {}", 123         # allowing () to be skipped (in some cases)
+        out "number: {}", 123               # allowing a default function on objects
+        out("number: {}", 123)              # allowing a default function on objects, but without allowing () to be skipped
+        # error handling?
+
+    Input streams could also have a default function.
+    But it would be limited to only reading e.g. a line.
+    (That's probably what iterators should do as well.)
+    What should it do on error?
+
+        string s = in()     # reads a line