author Samuel Lidén Borell <samuel@kodafritt.se> 2024-06-02 21:20:48 +0200
committer Samuel Lidén Borell <samuel@kodafritt.se> 2024-06-02 21:20:48 +0200
commit 580bf6130632f6855fddeea7b07c8401c56108f2
tree 4bd5e7cdb68408c52ad8df030f7f887c7d97def0 /notes/slul2.txt
parent db73835b12f41be8766384a1cdcc34a0848354dc
Notes: Usability, references, numeric types / comparisons, etc.
Diffstat (limited to 'notes/slul2.txt')
-rw-r--r-- notes/slul2.txt 340
1 files changed, 340 insertions, 0 deletions
diff --git a/notes/slul2.txt b/notes/slul2.txt
new file mode 100644
index 0000000..9d103d0
--- /dev/null
+++ b/notes/slul2.txt
@@ -0,0 +1,340 @@
+SLUL2
+=====
+
+Making SLUL:
+* easier to use
+* easier to implement
+* more future-proof / more portable
+
+Desirable changes:
+
+* Revise ref
+ - forbidding refs to non-struct/non-array types might enable some optimizations
+ - *removing* explicit refs would definitely have impacts on usability.
+ not sure if good or bad. it makes the proglang more implicit/"magic",
+ which can be a bad thing.
+* Implicit arenas?
+ - and for mutating methods, use the 4 "allocation variants": (inspired by Vale)
+ 1. placement new (uses arena only for "indirect"/references fields)
+ 2. arena new (allocate in given arena)
+ 3. modify self (in-place modification. Can keep referenced data)
+ 4. discarding self-mod (in-place modification. Cannot keep referenced data)
+ some of these are applicable to constructors also
+* Garbage-collected arenas?
+ - I.e. local GC
+ - It would remove "anxiety" around memory allocation
+ - Downside 1: It requires meta-data, which isn't needed with plain
+ arena allocation.
+ - Downsides N: The usual downsides with GC
+* Revise expr integer types
+ What are the use-cases for "non-plain ints"?
+ - length type / ssize/usize
+ - byte/int16 arrays
+ - small/bitsliced fields in structs
+ - fixed-size wrapping uints (e.g. for hash functions).
+* Move some stuff from hard-coded syntax to code? e.g. like Scheme, REBOL, Nim.
+ - might actually work with statements:
+ - they can only appear inside function bodies
+ - so the toplevels are available, and their types are known.
+ - But it would definitely need inlining to work with reasonable speed.
+ - Perhaps a bad idea after all?
+* Misc syntax stuff:
+ - Use tabs instead of spaces?
+ But this gets tricky with alignment of e.g. parameters.
+
+
+Control statements defined in library module headers
+----------------------------------------------------
+
+These need some kind of analysis of the IR to check varstates (liveness, etc).
+It also needs to handle nested if-elseif-elseif...
+
+So perhaps this is a bad idea?
+
+Example:
+
+ statement "if" Expr cond Statements true_block "else" Statements false_block
+ {
+ cond
+ CONDJUMP FALSE false_block
+ true_block
+ JUMP end
+ false_block
+ end
+ }
+ statement
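The "if"/"else" template above can be read as a macro that splices blocks into a flat jump-based IR. The following is a minimal Python sketch of that expansion, not SLUL's implementation: the tuple encoding, the `LABEL`, `EVAL` and `CALL` instruction names, and the `lower_if_else` helper are all assumptions made for illustration (only `CONDJUMP` and `JUMP` come from the notes). It also shows one way the nested if-elseif case could work, by giving every expansion fresh label names.

```python
from itertools import count

# Hypothetical source of unique label suffixes, so that nested
# expansions never reuse "false_block" / "end".
_label_ids = count()


def lower_if_else(cond, true_block, false_block):
    """Expand one if/else into a flat instruction list.

    Each argument is a list of already-lowered instructions (tuples).
    The layout follows the template in the notes:
    cond, CONDJUMP FALSE false_block, true_block, JUMP end,
    false_block, end.
    """
    n = next(_label_ids)
    false_label = f"false_block_{n}"
    end_label = f"end_{n}"
    return [
        *cond,
        ("CONDJUMP", "FALSE", false_label),
        *true_block,
        ("JUMP", end_label),
        ("LABEL", false_label),
        *false_block,
        ("LABEL", end_label),
    ]


# "if a { f() } else if b { g() } else { h() }" as a nested expansion:
inner = lower_if_else([("EVAL", "b")], [("CALL", "g")], [("CALL", "h")])
outer = lower_if_else([("EVAL", "a")], [("CALL", "f")], inner)

for instruction in outer:
    print(instruction)
```

The sketch only covers code generation. The varstate (liveness) analysis that the notes identify as the hard part would still have to run over the resulting IR.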
+
+Super-simple proglang
+---------------------
+
+Only two/three kinds of typedefs. Not allowed as anonymous types (or? it is useful for e.g. return values)
+
+ record SomeStruct {
+ int field1
+ OtherType field2 # <--- compiler chooses whether to put in ref or not
+ # this makes FFI trickier. But non-closed types are always refs.
+ # this also makes lifetimes and aliasing trickier.
+ # maybe "var" should not be allowed to alias?
+ }
+
+ enum SomeEnum {
+ ...
+ }
+
+ # Maybe some kind of sum/union/variant type
+ record ExprNode {
+ ExprType type
+ int line
+ int column
+ switch type {
+ case .unary
+ case .binary
+ Expr operand_a
+ if type == .binary {
+ Expr operand_b
+ }
+ case .call
+ Expr func_expr
+ int num_args
+ # have a built-in list type?
+ # and choose the best possible representation?
+ # (in this case it's runtime-determined frozen-length, so it could be a pointer to an array. or a full-blown list type)
+ int[num_args] args
+ }
+ }
+
+ # Maybe some kind of constraints
+ func process_op(ExprNode<.type in (.unary, .binary)> expr)
+ func process_op(ExprNode expr [.unary, .binary])
+ func process_op(ExprNode(.unary .binary) expr)
+ func process_op(ExprNode<.unary .binary> expr)
+ func ExprNode<.unary .binary>.process_op()
+ func ExprNode.process_op()
+ for (.unary, .binary)
+ func ExprNode.process_op()
+ with (.unary, .binary)
+ func ExprNode.process_op()
+ this in (.unary, .binary)
+ func ExprNode.process_op()
+ given type == .unary or type == .binary
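Whatever syntax is chosen, the constraint means that `process_op` may only be called on nodes whose `type` is `.unary` or `.binary`. As a rough illustration of the semantics, here is a Python sketch that enforces the same rule at runtime. It assumes a simplified `ExprNode` (the `.call` fields are omitted) and an invented return value (the operand count); in SLUL the idea would presumably be to reject bad calls at compile time instead.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ExprType(Enum):
    UNARY = auto()
    BINARY = auto()
    CALL = auto()


@dataclass
class ExprNode:
    # Simplified stand-in for the record in the notes.
    type: ExprType
    line: int
    column: int
    operand_a: Optional["ExprNode"] = None
    operand_b: Optional["ExprNode"] = None  # only meaningful for BINARY


def process_op(expr: ExprNode) -> int:
    """Return the operand count of a unary or binary node.

    The membership test below is the runtime equivalent of the
    proposed "type in (.unary, .binary)" constraint.
    """
    if expr.type not in (ExprType.UNARY, ExprType.BINARY):
        raise TypeError(f"process_op does not accept {expr.type.name} nodes")
    return 2 if expr.type is ExprType.BINARY else 1


leaf = ExprNode(ExprType.UNARY, line=1, column=1)
add = ExprNode(ExprType.BINARY, line=1, column=3, operand_a=leaf, operand_b=leaf)
print(process_op(leaf), process_op(add))
```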
+
+Qualifiers for records and enums:
+
+ record Point closed { # (require a newline here?)
+ int x
+ int y
+ # no more fields can be added. allows some optimizations, such as call-by-value / embedding into structs
+ }
+
+ enum SubPixel closed {
+ .red
+ .green
+ .blue
+ }
+
+Enums can have a base type and/or integral values also
+(this is mainly useful for FFI)
+
+ enum StatusByte closed byte {
+ .ready = 10
+ .running = 20
+ .failure = 90
+ }
+
+Integer / elementary types:
+
+* Perhaps even use variable-size integers?
+ The downside is that += 1 etc. might require allocation.
+
+Methods:
+
+* Skip "this". But disallow shadowing.
+
+Type identifiers
+
+* For consistency, always include the "." in typeidentifiers, even in
+ e.g. enum definitions.
+* Constructors are maybe not that intuitive (can they be improved?):
+
+ func .new(int a, int b) -> Thing
+
+
+Avoiding punctuation:
+
+* Can the . in typeidentifiers be skipped?
+* Can the () in function calls be skipped?
+ - if the function call fits on one line
+ - (unless a comma is required between them) and the parameters are terms
+ - and the function call is not nested inside
+ a function call, field or index expression.
+ - related: tuples. but that would be ambiguous if used as function arguments
+* Can the () in function declarations be skipped?
+
+ func example
+ int a
+ int b
+ return bool
+ {
+ if a == b {
+ otherfunc a, 123
+ return true
+ }
+
+ }
+
+Can refs be avoided?
+
+ # objects:
+ # These are always passed by reference.
+ # References can be compared with "ref_is" or "is" or a similar operator.
+ # The "==" and "!=" operators are not allowed (maybe it should be allowed to implement them? e.g. with a method called "equals"?)
+ type Box = object {
+ # These are references:
+ Item a
+ Item b
+ # Perhaps allow syntax like this:
+ Item a1, b1
+ Item a1, Item b1
+ # Regarding tuples:
+ # I think that maybe they CAN be references if they are too large to use values :)
+ # - We can require that if the object is mutable, it must also be passed by arena-ref.
+ # - Tuples up to some certain size could be embedded / passed by value
+ # (Check the optimal limit. It's at least the size of two pointers, but it could be larger)
+ # - Tuples allocated in the *same* arena can just be referenced directly!
+ # (this should be fairly simple and fast to check).
+ # - If each thread uses a contiguous virtual-memory block,
+ # then this would be a trivial range check.
+ # - Tuples allocated by the same thread and in SLUL code, can
+ # (as an optional optimization) be referenced if
+ # 1) the lifetime allows it (how to check this at runtime?), or
+ # 2) the runtime uses garbage collection, and can perform GC in
+ # this case.
+ # - Tuples allocated in SLUL code from other threads may or may not
+ # be possible to reference depending on whether the runtime
+ # supports cross-thread GC. For consistency across implementations,
+ # it might be better to just re-allocate/copy in this case.
+ # - Other tuples would require a copy. (This is really a requirement
+ # for tuples allocated from C code, unless it uses SLUL's arena
+ # allocation functions in slulrt.)
+ #
+ LargeValue large
+ }
+ # opaque objects:
+ # - Like objects, but fields (and layout/size) are inaccessible
+ # - Lacks {} and has the layout defined in the impl, just like a function can have its body in the impl
+ # - Perhaps it should be forbidden to have non-opaque objects in interfaces? It's generally an anti-pattern.
+ type Item = object
+ # tuples:
+ # - The ABI decides when to pass these by ref or value
+ # - Reference comparison operators are not allowed.
+ # - The contents can be compared with the "==" and "!=" operators.
+ # - Tuples can't be opaque/private.
+ type Point = (int x, int y)
+ type Point = (int x, y) # perhaps allow this syntax as well (...and multi-line syntax without comma also)
+ type LargeValue = ([10000]byte buffer)
+ # For type-scoped functions that return an object ("constructors"):
+ # - They implicitly take an arena parameter
+ # - The returned reference is an arena reference
+ func .new() -> Box
+ constructor Box.new() # maybe type-scoped functions should have this syntax?
+ # Return values in methods have the same lifetime as the object itself
+ # - Should the this parameter be "var"?
+ # - Should the this parameter be "arena"?
+ func Box.get_contents() -> Item
+ # Parameters do not implicitly transfer ownership. Inside the callee, the lifetime of "other" ends when the function returns.
+ # - Should the parameter be "arena"?
+ func Box.equals(Box other) -> bool
+ # Parameters can be marked with "keep" to allow shared ownership
+ func Box.set_contents(keep Item contents)
+ func Box.set_contents!(keep Item contents) # perhaps there should be a ! for functions that modify the object?
+ func var Box.set_contents(keep Item contents) # or a qualifier like this.
+ # Parameters can be passed as "var"
+ # - if passed as "keep", we need exclusive access (or the item can be marked as aliased)
+ func Box.squeeze_item(keep var Item item)
+ func Box.squeeze_item(keep aliased var Item item)
+ # To have multiple outputs from a function, use a tuple as the return value:
+ # - The ABI decides when to pass these by ref (implicit parameter) or value
+ # - Because tuples can't be opaque, the return value could be returned by value.
+ # - Because tuples can't be opaque, the *caller* allocates (on stack) if it's not possible to pass by value.
+ func Box.get_both_items() -> (Item a, Item b)
+
+ # How should function references work?
+ # - What keyword to use when there are no refs?
+ # - Most of the time, you want a context-parameter
+ # - For non-ref (or slot) types, you may want a (reference, length) to process multiple items at once.
+ func Box.process_contents(delegate(Item item) handler)
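The tuple comments inside `Box` suggest that a tuple allocated in the same arena can be referenced directly, and that with one contiguous block per arena (or per thread) the test is a plain range check. A minimal Python sketch of that decision follows. Addresses are modelled as integers, and `Arena`, `contains` and `pass_tuple` are hypothetical names, not part of slulrt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arena:
    # Assumption: the arena owns one contiguous address range.
    base: int
    size: int

    def contains(self, addr: int, length: int = 1) -> bool:
        """True if [addr, addr + length) lies entirely inside the arena."""
        return self.base <= addr and addr + length <= self.base + self.size


def pass_tuple(arena: Arena, addr: int, length: int) -> str:
    """Decide how a too-large-for-value tuple is handed to the arena's owner.

    Same arena: reference it directly. Anything else: copy it, which
    matches the fallback the notes propose for foreign tuples.
    """
    return "reference" if arena.contains(addr, length) else "copy"


arena = Arena(base=0x10000, size=0x1000)
print(pass_tuple(arena, 0x10010, 16))  # inside the arena
print(pass_tuple(arena, 0x20000, 16))  # allocated elsewhere
```

The check is two comparisons, so it is cheap enough to do on every call. The lifetime question raised in the notes (case 1, "how to check this at runtime?") is not addressed by this sketch.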
+
+Should the builtin types use TitleCase names also?
+
+ Probably yes, for consistency.
+
+ String
+ Byte
+ Int16
+
+ Java developers might confuse these with reference types, though :(
+ And worse, as a Java developer, you might start using e.g. Byte
+ where it should be byte in your Java code. That will usually silently
+ compile without any warnings, but can be broken (with == != operators)
+ or slow.
+
+ Solution:
+
+ Use different names:
+ - byte -> UInt8 or U8
+ - int -> Integer (or even skip this type, and have only fixed-sized Int*/UInt*)
+
+ Regarding the extra finger strain to hold shift and stretching out the
+ finger to push the letter button: That could be solved by having the IDE
+ auto-capitalize the type if it exists and a type is expected at the given
+ location.
+
+
+Should/can there be an arbitrary-sized integer type?
+
+ E.g. allow integers -16384..16383 to be stored directly, and use a
+ reference for larger integers.
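The range -16384..16383 is exactly what fits in 15 signed bits, which leaves one bit of a 16-bit word free as a tag. The Python sketch below shows one possible encoding under that assumption: low bit 1 means "inline value", low bit 0 means "index into out-of-line storage". The `NumHeap` class and the list used as a heap are illustrative only; the notes do not specify a layout.

```python
INLINE_MIN, INLINE_MAX = -16384, 16383  # range given in the notes


class NumHeap:
    """Toy model of inline small integers plus boxed large ones."""

    def __init__(self):
        self.big = []  # stand-in for heap/arena storage of large values

    def encode(self, value: int) -> int:
        """Pack a value into a word: tag bit 1 = inline, 0 = reference."""
        if INLINE_MIN <= value <= INLINE_MAX:
            # Keep the low 15 bits (two's complement) and set the tag.
            return ((value & 0x7FFF) << 1) | 1
        self.big.append(value)
        # Note: this sketch does not bound the index to 15 bits.
        return (len(self.big) - 1) << 1

    def decode(self, word: int) -> int:
        if word & 1:
            payload = word >> 1
            # Sign-extend from 15 bits.
            return payload - 0x8000 if payload & 0x4000 else payload
        return self.big[word >> 1]


heap = NumHeap()
for n in (0, -1, 16383, -16384, 16384, 10**30):
    assert heap.decode(heap.encode(n)) == n
print(len(heap.big))  # number of values that needed out-of-line storage
```

This also makes the earlier concern concrete: incrementing 16383 crosses from the inline form to the boxed form, so `+= 1` can allocate.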
+
+ What should it be called?
+
+ num
+ int
+ intn
+ integer
+ BigNum
+ BigInt
+ Num
+ Int
+ IntN
+ Integer
+
+ The compiler could optimize it to a more efficient type if the range
+ is known!
+
+ num i = get_number() # since it is immutable, we can infer the type from the return value of get_number()
+
+ Maybe it should be possible to specify a range? What syntax to use?
+
+ var num<0..=10> i = 0
+ var num<0 upto 10> i = 0
+ var num i [[0 <= value <= 10]] = 0
+ var num<0-10> i = 0 # but "-" is also the minus operator :(
+ var num<0~10> i = 0
+
+Print function:
+
+ How simple can it be, without creating confusion/problems or hard-coding things?
+
+ out.print("number: {}", .[123]) # array constructor
+ out.print("number: {}", 123) # safe variant-type var-arg
+ out.print "number: {}", 123 # allowing () to be skipped (in some cases)
+ out "number: {}", 123 # allowing a default function on objects
+ out("number: {}", 123) # allowing a default function on objects, but without allowing () to be skipped
+ # error handling?
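For the "safe variant-type var-arg" form, the core of `print` is just sequential substitution of `{}` placeholders, and the open "error handling?" question becomes what to do when the placeholder count and the argument count differ. The Python sketch below takes one possible stance (reject the call); `format_message` is a hypothetical helper, and escaping of literal braces is deliberately left out.

```python
def format_message(template: str, *args) -> str:
    """Replace each "{}" in template with the next argument, in order.

    Raises ValueError when the number of placeholders and arguments
    differ, instead of silently printing a truncated message.
    """
    parts = template.split("{}")
    placeholders = len(parts) - 1
    if placeholders != len(args):
        raise ValueError(
            f"{placeholders} placeholder(s) but {len(args)} argument(s)"
        )
    out = [parts[0]]
    for arg, tail in zip(args, parts[1:]):
        out.append(str(arg))
        out.append(tail)
    return "".join(out)


print(format_message("number: {}", 123))
```

In a statically typed language the same count check could likely be done at compile time whenever the format string is a literal, which would leave only I/O failures as runtime errors.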
+
+ Input streams could also have a default function.
+ But it would be limited to only reading e.g. a line.
+ (That's probably what iterators should do as well.)
+ What should it do on error?
+
+ string s = in() # reads a line