SLUL2
=====

Making SLUL:
* easier to use
* easier to implement
* more future-proof / more portable

Desirable changes:
* Revise ref
    - forbidding refs to non-struct/non-array types might enable
      some optimizations
    - *removing* explicit refs would definitely have impacts on usability.
      not sure if good or bad. it makes the proglang more implicit/"magic",
      which can be a bad thing.
* Implicit arenas?
    - and for mutating methods, use the 4 "allocation variants":
      (inspired by Vale)
        1. placement new (uses arena only for "indirect"/reference fields)
        2. arena new (allocate in given arena)
        3. modify self (in-place modification. Can keep referenced data)
        4. discarding self-mod (in-place modification. Cannot keep referenced data)
      some of these are applicable to constructors also
* Garbage-collected arenas?
    - I.e. local GC
    - It would remove "anxiety" around memory allocation
    - Downside 1: It requires meta-data, which isn't needed with
      plain arena allocation.
    - Downsides N: The usual downsides with GC
* Revise expr integer types
  What are the use-cases for "non-plain ints"?
    - length type / ssize/usize
    - byte/int16 arrays
    - small/bitsliced fields in structs
    - fixed-size wrapping uints (e.g. for hash functions).
* Move some stuff from hard-coded syntax to code?
  e.g. like Scheme, REBOL, Nim.
    - might actually work with statements:
        - they can only appear inside function bodies
        - so the toplevels are available, and their types are known.
    - But it would definitely need inlining to work with reasonable speed.
    - Perhaps a bad idea after all?
* Misc syntax stuff:
    - Use tabs instead of spaces?
      But this gets tricky with alignment of e.g. parameters.

Control statements defined in library module headers
----------------------------------------------------

These need some kind of analysis of the IR to check varstates
(liveness, etc). It also needs to handle nested if-elseif-elseif...
So perhaps this is a bad idea?
Example:

    statement "if" Expr cond Statements true_block "else" Statements false_block
    {
        cond
        CONDJUMP FALSE false_block
        true_block
        JUMP end
        false_block
        end
    }
    statement

Super-simple proglang
---------------------

Only two/three kinds of typedefs. Not allowed as anonymous types
(or? it is useful for e.g. return values)

    record SomeStruct {
        int field1
        OtherType field2   # <--- compiler chooses whether to put in ref or not
                           # this makes FFI trickier. But non-closed types are always refs.
                           # this also makes lifetimes and aliasing trickier.
                           # maybe "var" should not be allowed to alias?
    }

    enum SomeEnum {
        ...
    }

    # Maybe some kind of sum/union/variant type
    record ExprNode {
        ExprType type
        int line
        int column
        switch type {
        case .unary
        case .binary
            Expr operand_a
            if type == .binary {
                Expr operand_b
            }
        case .call
            Expr func_expr
            int num_args
            # have a built-in list type?
            # and choose the best possible representation?
            # (in this case it's runtime-determined frozen-length, so it
            # could be a pointer to an array. or a full-blown list type)
            int[num_args] args
        }
    }

    # Maybe some kind of constraints
    func process_op(ExprNode<.type in (.unary, .binary)> expr)
    func process_op(ExprNode expr [.unary, .binary])
    func process_op(ExprNode(.unary .binary) expr)
    func process_op(ExprNode<.unary .binary> expr)
    func ExprNode<.unary .binary>.process_op()
    func ExprNode.process_op() for (.unary, .binary)
    func ExprNode.process_op() with (.unary, .binary)
    func ExprNode.process_op() this in (.unary, .binary)
    func ExprNode.process_op() given type == .unary or type == .binary

Qualifiers for records and enums:

    record Point closed {   # (require a newline here?)
        int x
        int y
        # no more fields can be added.
        # This allows some optimizations, such as call-by-value / embedding into structs
    }
    enum SubPixel closed {
        .red
        .green
        .blue
    }

Enums can have a base type and/or integral values also
(this is mainly useful for FFI)

    enum StatusByte closed byte {
        .ready = 10
        .running = 20
        .failure = 90
    }

Integer / elementary types:
* Perhaps even use variable-size integers?
  The downside is that += 1 etc. might require allocation.

Methods:
* Skip "this". But disallow shadowing.

Type identifiers
* For consistency, always include the "." in typeidentifiers,
  even in e.g. enum definitions.
* Constructors are maybe not that intuitive (can they be improved?):
    func .new(int a, int b) -> Thing

Avoiding punctuation:
* Can the . in typeidentifiers be skipped?
* Can the () in function calls be skipped?
    - if the function call fits on one line
    - and the parameters are terms (unless a comma is required between them)
    - and the function call is not nested inside a function call,
      field or index expression.
    - related: tuples. but that would be ambiguous if used as function arguments
* Can the () in function declarations be skipped?

    func example int a int b return bool {
        if a == b {
            otherfunc a, 123
            return true
        }
        return false
    }

Can refs be avoided?

    # objects:
    # These are always passed by reference.
    # References can be compared with "ref_is" or "is" or a similar operator.
    # The "==" and "!=" operators are not allowed (maybe it should be
    # allowed to implement them? e.g. with a method called "equals"?)
    type Box = object {
        # These are references:
        Item a
        Item b
        # Perhaps allow syntax like this:
        Item a1, b1
        Item a1, Item b1
        # Regarding tuples:
        # I think that maybe they CAN be references if they are too large to use values :)
        # - We can require that if the object is mutable, it must also be passed by arena-ref.
        # - Tuples up to some certain size could be embedded / passed by value
        #   (Check the optimal limit.
        #   It's at least the size of two pointers, but it could be larger)
        # - Tuples allocated in the *same* arena can just be referenced directly!
        #   (this should be fairly simple and fast to check).
        #     - If each thread uses a contiguous virtual-memory block,
        #       then this would be a trivial range check.
        # - Tuples allocated by the same thread and in SLUL code, can
        #   (as an optional optimization) be referenced if
        #     1) the lifetime allows it (how to check this at runtime?), or
        #     2) the runtime uses garbage collection, and can perform GC in
        #        this case.
        # - Tuples allocated in SLUL code from other threads may or may not
        #   be possible to reference depending on whether the runtime
        #   supports cross-thread GC. For consistency across implementations,
        #   it might be better to just re-allocate/copy in this case.
        # - Other tuples would require a copy. (This is really a requirement
        #   for tuples allocated from C code, unless it uses SLUL's arena
        #   allocation functions in slulrt.)
        # LargeValue large
    }

    # opaque objects:
    # - Like objects, but fields (and layout/size) are inaccessible
    # - Lacks {} and has the layout defined in the impl, just like a
    #   function can have its body in the impl
    # - Perhaps it should be forbidden to have non-opaque objects in
    #   interfaces? It's generally an anti-pattern.
    type Item = object

    # tuples:
    # - The ABI decides when to pass these by ref or value
    # - Reference comparison operators are not allowed.
    # - The contents can be compared with the "==" and "!=" operators.
    # - Tuples can't be opaque/private.
    type Point = (int x, int y)
    type Point = (int x, y)   # perhaps allow this syntax as well (...and multi-line syntax without comma also)
    type LargeValue = ([10000]byte buffer)

    # For type-scoped functions that return an object ("constructors"):
    # - They implicitly take an arena parameter
    # - The returned reference is an arena reference
    func .new() -> Box
    constructor Box.new()   # maybe type-scoped functions should have this syntax?

    # Return values in methods have the same lifetime as the object itself
    # - Should the "this" parameter be "var"?
    # - Should the "this" parameter be "arena"?
    func Box.get_contents() -> Item

    # Parameters do not implicitly transfer ownership. Inside the callee,
    # the lifetime of "other" ends when the function returns.
    # - Should the parameter be "arena"?
    func Box.equals(Box other) -> bool

    # Parameters can be marked with "keep" to allow shared ownership
    func Box.set_contents(keep Item contents)
    func Box.set_contents!(keep Item contents)      # perhaps there should be a ! for functions that modify the object?
    func var Box.set_contents(keep Item contents)   # or a qualifier like this.

    # Parameters can be passed as "var"
    # - if passed as "keep", we need exclusive access (or the item can be marked as aliased)
    func Box.squeeze_item(keep var Item item)
    func Box.squeeze_item(keep aliased var Item item)

    # To have multiple outputs from a function, use a tuple as the return value:
    # - The ABI decides when to pass these by ref (implicit parameter) or value
    # - Because tuples can't be opaque, the return value could be returned by value.
    # - Because tuples can't be opaque, the *caller* allocates (on stack)
    #   if it's not possible to pass by value.
    func Box.get_both_items() -> (Item a, Item b)

    # How should function references work?
    # - What keyword to use when there are no refs?
    # - Most of the time, you want a context-parameter
    # - For non-ref (or slot) types, you may want a (reference, length)
    #   to process multiple items at once.
    func Box.process_contents(delegate(Item item) handler)

Should the builtin types use TitleCase names also?
Probably yes, for consistency.

    String
    Byte
    Int16

Java developers might confuse these with reference types, though :(
And worse, as a Java developer, you might start using e.g. Byte where it
should be byte in your Java code. That will usually silently compile without
any warnings, but can be broken (== and != compare references on boxed
types) or slow.
Solution: Use different names:
- byte -> UInt8 or U8
- int -> Integer (or even skip this type, and have only fixed-sized Int*/UInt*)

Regarding the extra finger strain of holding shift while stretching out a
finger to press the letter key: That could be solved by having the IDE
auto-capitalize the type if it exists and a type is expected at the given
location.

Should/can there be an arbitrary-sized integer type?
E.g. allow integers -16384..16383 to be stored directly, and use a reference
for larger integers.
What should it be called?

    num
    int
    intn
    integer
    BigNum
    BigInt
    Num
    Int
    IntN
    Integer

The compiler could optimize it to a more efficient type if the range is known!

    num i = get_number()   # since it is immutable, we can infer the type from the return value of get_number()

Maybe it should be possible to specify a range? What syntax to use?

    var num<0..=10> i = 0
    var num<0 upto 10> i = 0
    var num i [[0 <= value <= 10]] = 0
    var num<0-10> i = 0   # but "-" is also the minus operator :(
    var num<0~10> i = 0

Print function:
How simple can it be, without creating confusion/problems or hard-coding things?

    out.print("number: {}", .[123])   # array constructor
    out.print("number: {}", 123)      # safe variant-type var-arg
    out.print "number: {}", 123       # allowing () to be skipped (in some cases)
    out "number: {}", 123             # allowing a default function on objects
    out("number: {}", 123)            # allowing a default function on objects, but without allowing () to be skipped
    # error handling?

Input streams could also have a default function. But it would be limited to
only reading e.g. a line. (That's probably what iterators should do as well.)
What should it do on error?

    string s = in()   # reads a line
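
Regarding the "safe variant-type var-arg" print variant and its open
"error handling?" question: a rough C sketch of what the runtime side might
look like if the compiler packed the arguments into an array of tagged
values. Everything here (Arg, ArgKind, format_args, the -1 error return) is
made up for illustration. None of it exists in slulrt, and only ints and
strings are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tagged argument, as a compiler might build it for var-args. */
typedef enum { ARG_INT, ARG_STR } ArgKind;

typedef struct {
    ArgKind kind;
    union {
        int64_t i;
        const char *s;
    } v;
} Arg;

/*
 * Copies fmt into buf, replacing each "{}" with the next argument.
 * Returns 0 on success. Returns -1 if the number of "{}" placeholders
 * does not match nargs, or if the result does not fit in buf.
 */
int format_args(char *buf, size_t bufsize, const char *fmt,
                const Arg *args, size_t nargs)
{
    size_t out = 0;   /* bytes written so far (excluding the terminator) */
    size_t next = 0;  /* index of the next unused argument */

    if (bufsize == 0) return -1;
    while (*fmt != '\0') {
        if (fmt[0] == '{' && fmt[1] == '}') {
            int n;
            if (next >= nargs) return -1;  /* too few arguments */
            if (args[next].kind == ARG_INT) {
                n = snprintf(buf + out, bufsize - out, "%lld",
                             (long long)args[next].v.i);
            } else {
                n = snprintf(buf + out, bufsize - out, "%s",
                             args[next].v.s);
            }
            /* snprintf returns the untruncated length */
            if (n < 0 || (size_t)n >= bufsize - out) return -1;
            out += (size_t)n;
            next++;
            fmt += 2;
        } else {
            if (out + 1 >= bufsize) return -1;  /* keep room for '\0' */
            buf[out++] = *fmt++;
        }
    }
    buf[out] = '\0';
    return next == nargs ? 0 : -1;  /* too many arguments is also an error */
}
```

With this shape, a mismatch between placeholders and arguments is a
recoverable error at runtime. Since the format string is usually a literal,
the compiler could probably check the same thing at compile time instead.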