Misc usability improvements
===========================

Old notes regarding simplification of parameter/field definitions
-----------------------------------------------------------------

Alt. 1:
* Should "ref" in parameters be automatic?
    - That would allow small constant items (e.g. 1 or 2 machine words)
      to be passed by register
    - Local variable syntax? Should ref be skipped there as well?
      (that makes parsing trickier - perhaps "struct X" syntax should be used?)
    - For structs it is a more difficult question!
* Which typedefs should be allowed, and how should typedefs be referenced?
    - struct X / enum X / num X ?

Alt. 2:
* Use Pascal-style definitions?
    - This would simplify type parsing
    - BUT what syntax to use for qualifiers?
        var x: int   (Pascal style)
        x: var int   (more logical)
    - With long types, it could make definitions with initializations
      harder to read:
        x: SomeLongTypeHere = xyz
    - It also makes definitions stand out less, since there is not
      necessarily any keyword at the start of the line.


[DONE] Simplified declarations/definitions
------------------------------------------

Remove "-> void"

    func f() -> int
    func g()          # no return value

Remove "= private" in typedecls!

Why? There is an asymmetry in top-level declarations:

    private             non-private
    -------             -----------
    int x               int x = int
    func f()            func f() { ... }
    type T = private    type T = ...

Without "= private":

    type T              type T = ...


Simplified arena passing
------------------------

Instead of trying to pass around arenas everywhere and manage lifetimes
(which should still be supported), it may be easier to use special
arena parameters:

    func f(arena) -> arena Thing
    func f(arena, ref Thing t) -> arena OtherThing

Maybe this also makes more sense with regard to exception handling?
- YES: simpler
- NO: all ACCESSIBLE arenas still have to be destroyed

But... Wouldn't it be necessary to destroy any arena with any item inside
that can be reached through a "var" (or "writeonly") view?
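
To make the exception-handling question above concrete, here is a toy model
in Python (not SLUL). Everything in it is an assumption made for
illustration: the Arena class, make_thing() and the use of a "with" block
are hypothetical stand-ins, not part of slulrt. It only shows the point from
the "NO" bullet: when an error propagates, every arena that is still
accessible has to be destroyed.

```python
class Arena:
    """Toy arena: owns every object allocated through it (hypothetical)."""

    def __init__(self):
        self.items = []
        self.destroyed = False

    def alloc(self, obj):
        if self.destroyed:
            raise RuntimeError("arena already destroyed")
        self.items.append(obj)
        return obj

    def destroy(self):
        # Frees everything at once; individual items are never freed.
        self.items.clear()
        self.destroyed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit AND when an exception propagates.
        self.destroy()
        return False  # do not swallow the exception


def make_thing(arena, name):
    # Roughly corresponds to: func f(arena) -> arena Thing
    return arena.alloc({"name": name})
```

A caller would write "with Arena() as arena: make_thing(arena, "a")", and the
arena is destroyed even if the body raises.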


Simplified reference syntax
---------------------------

"arena var", "ref threaded var" etc. are quite long. Can the syntax be
shortened?

Also, what should the "arena" keyword be called?
- arena
- region
- partition
- section
- sector
- obj
- mem
- sys
- access
- token

Another way would be to make "var" imply "arena". I.e.:
- "ref X" = ref to constant X
- "var X" = arena-pointer to X
... BUT isn't there also a case when something is mutable but not
    arena-allocated?
    (but if everything [except constants] is arena-allocated, then it is fine)
    (this is probably a good idea!)

Does "arena" without "var" make sense?
- This depends on whether one would always want to have a separate
  "(arena," parameter.
- ... BUT what about local variables? should "var" imply "arena"
  (and "ref") there!??
    - but we could require an explicit "stack" keyword for stack allocation
- ... BUT what about stack-allocated data? That can be "var" but NOT "arena"!
- Follow up issue: Should there be garbage-collected arenas?
  (And if so, should incoming refs be handled? i.e. "roots" as seen
  from "inside" the arena)

Do implicit pointer parameters make sense?
- Yes, when only considering parameters.
- ...and similarly, global data would be "implicitly" non-pointers.
- BUT how about local variables?
- "var" and "ref" are not very visually distinctive.
- ALSO: "var" and "ref var" and "var ref" are NOT the same things!


Reduce the number of types?
---------------------------

Currently:

    Scalars:    References:    Vectors:
    bool        ref            []...
    byte        arena          list<...>
    intN        own            string
    uintN                      map<...>
    wuintN      Compound:
    float       struct ...
                enum ...
                union ...

Merge []... and list<...> ?
- Fixed-size vs growable
- Known-size vs unknown-size

Change syntax of list?
- [dynamic]int
- [...]int
- ...BUT these are internally some kind of "fat pointer" type
- ...BUT how about covariance/contravariance?
  (if inheritance or interfaces is added)
- ...BUT passing a [N]E to a [dynamic]E requires creation of a "fat pointer"

Renames?
- wuintN -> wrapN

Merge/simplify struct/union/"class"?

What would "super-easy-lang" have?
- bool
- num = intN/uintN
- float
- enum
- ref
- struct/class
- array/map
- string


Related: Use range types instead of different number types?
-----------------------------------------------------------

e.g.

    int16 = int range -32768..32767
    uint16 = int range 0..65535

The default range for int could be -2^32..(2^32-1)

Problem 1: Types with system dependent sizes:
- Mainly "size" and "usize" types
  (but also int which can be 16 bit on embedded platforms)
--> We need two ranges: minimum and maximum (...maximum = 0..2^64-1)
    In most cases those will be the same.

Problem 2: Declarations need to start with a keyword/symbol.
--> Keep uint, intN, uintN as keywords! (but remove wuint/wuintN?)

Alternative syntax:

    type int16 = int<-32768..32767>

This could also be extended with "overflow modes":
- bounds checked (default)
- wrapping
- saturating
- invalidating (e.g. a special reserved NaN/sentinel value.
  For this reason, it can't use the full range)


Related: Use "wraparound" operation instead of wuint?
-----------------------------------------------------

Instead of:

    wuint x = y + z

it could be written as:

    uint x = wraparound(y + z)

it can also be extended with saturating arithmetic (decide on keyword name):

    uint x = saturate(y + z)
    uint x = limit(y + z)
    uint x = cap(y + z)

Disadvantage: Risk of confusion(?):

    uint x = wraparound(f(a + b, c) + y)

Neither a+b nor f() will be computed with wraparound.

Also: It needs a "nowraparound" keyword also.


Misc. simplifications
---------------------

[DONE] * Don't use typescopes (.true/.false) for bool, and instead use
         keywords (true/false).
* Remove "ref"/"enum" from type params? (i.e. write "List" instead of "List")
* Change struct value syntax from (x,y) to [x,y]
    - How about named fields?
* Change "else if" into "elif"?
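
Returning to the "overflow modes" listed in the range type notes above: the
four modes can be compared side by side with a small Python model (not
SLUL). The function name make_range_add and the mode strings are
hypothetical, and None merely stands in for the reserved sentinel value of
the "invalidating" mode. This is a sketch of the intended semantics under
those assumptions, not of how cslul would implement them.

```python
def make_range_add(lo, hi, mode):
    """Return an add function for integers restricted to lo..hi (inclusive)."""
    span = hi - lo + 1

    def add(a, b):
        result = a + b
        if lo <= result <= hi:
            return result                     # in range: all modes agree
        if mode == "checked":
            raise OverflowError(f"{result} is outside {lo}..{hi}")
        if mode == "wrapping":
            return (result - lo) % span + lo  # wrap around within lo..hi
        if mode == "saturating":
            return hi if result > hi else lo  # clamp to the nearest bound
        if mode == "invalidating":
            return None                       # stand-in for the sentinel
        raise ValueError(f"unknown mode: {mode}")

    return add


# uint16 = int range 0..65535, int16 = int range -32768..32767
wrap_u16 = make_range_add(0, 65535, "wrapping")
sat_i16 = make_range_add(-32768, 32767, "saturating")
```

With these definitions, wrap_u16(65535, 1) gives 0 and sat_i16(32767, 1)
stays at 32767. Only the outermost addition is affected, which matches the
confusion risk noted for wraparound(f(a + b, c) + y).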


Reducing error-prone-ness
-------------------------

Things to improve:

* Accidental string concatenations: ["a" "b"] vs ["a", "b"]
    - Use a concatenation operator? E.g. | & + . ++
    - Allow it only in the last parameter/element?
    - Use a "trailer" character? E.g. "a"> "b" or "a"- "b" or "a"... "b"
    - Have a different trailer character that also adds a newline?
* Forbid == when both sides are option types (and neither is none).
  That is most likely an error. There could be some special syntax,
  perhaps ?==? or "x or none == y or none"?
* Add a default syntax:
    string name = get_name() default "Unknown"
    - should it apply to all components inside the expression? For example:
      string name = get_users().get_logged_in_user().get_profile().get_name() default "Unknown"
* Change the range operator into two range operators (like in Odin)
    ..= inclusive range
    ..< exclusive range
  (perhaps also allow these in "case" values?)


Reducing cognitive load
-----------------------

* File-local identifiers? E.g. "local func f()"
* Enforce prefixes in exported symbols?
    \prefixes type=Somelib func=somelib_ data=somelib_
  (default: use module name. capitalized but otherwise case-insensitive
  for types, and lowercase and suffixed by "_" for func/data)


Removing ref? And if so, where / how much?
------------------------------------------

Where?
* structs (bad idea?)
* function params
* local variables
* type parameters (simple / no problems)

Basic solution:
- Always pass arena/own refs by reference
- Pass private types to public functions by reference
- Pass private types to functions in other modules by reference
- Pass small const items by value (const = immutable and not shared)
- Pass other items by reference

Better solution:
- Have a "ref" keyword (and "arena"/"own" keywords) and a "stack" keyword
- With no keyword, the compiler decides (based on the public function
  definition only, except for internal functions)
- With no keyword, it is NOT possible to specify a lifetime
- For structs, a "ref" or "include" keyword is mandatory

Problem: How should the equality operation work?
- file1 == file2: In this case the user probably wants to compare by
  reference (which is in fact the only possibility).
- point1 == point2: In this case the user probably wants to compare by
  contents (which might be the only possibility, if the points are
  passed by value)

Solution 1:
- Compare all private types by reference (even in the same module),
  and all others by value.

Solution 2:
- Use two separate keywords for structs: "record"(?) and "object"(?)
- "object"s are compared by reference (but might still be passed by value
  in internal functions if it does not change behavior)
- "record"s are compared by value (but might still be passed by reference).
  records might get copied instead of referenced when assigning them or
  otherwise passing them around.
- "record"s are compared with deep comparison (the module defining the
  record also contains an invisible comparison function, if the record
  is not closed, or if the SLUL ABI for the platform requires it
  [e.g. for large structs])
- records passed as "var" or "writeonly" (or "aliased" or "threaded")
  are always passed by reference.
- private types can only be "object"s.
- (maybe:) "record"s cannot contain "object"s. (no)

Problem 2: How to handle option types?
- References can be optional, but pure values cannot.
- Should ? automatically create a ref? (Except for very small types perhaps?)
- If so, should "var ? T" be forbidden?

"Problem" 3: Private types
- Types without a decl (in interfaces) can only refer to "object"s.

Problem 4: types in arrays (and structs)
- Is this an array of refs to records, records, refs to objects or objects:
    [3]Obj

For structs there are several cases:

    type Something = struct {
        int x
        ref BaseObj base
        ref? Something next
        include Point point
    }

Keyword: include, inline, copy, value, data, ...?

Problems with the basic solution:
* there is too much magic. f(small) vs f(large) should appear to work the
  same, but doesn't if f() keeps the object.
* interaction with lifetime conditions.


Removing "ref" from local variables
-----------------------------------

This means that both declarations and expressions can start with an
identifier. So we need to disambiguate these two cases from each other.

cslul already looks at the character following the identifier (to
distinguish versioned idents from non-versioned ones). But this is
not enough:

    T x = 0
    i = 0
     ^-- character after identifier

Solutions:
1. Copy the identifier name (slows down parsing)
2. Require types to start with an uppercase letter, and data/functions
   to start with a lowercase letter.
3. Skip whitespace after identifiers, and then check for a following
   character.

   Following characters and result:

    !  ---    %  ---    )  ---    -  EXPR   ;  ---    ?  EXPR   ]  ---    |  ---
    "  ---    &  ---    *  EXPR   .  EXPR   <  TYPE   @  ---    ^  ---    }  ---
    #  ---    '  ---    +  EXPR   /  EXPR   =  EXPR   [  EXPR   `  ---    ~  ---
    $  ---    (  EXPR   ,  ---    :  ---    >  ---    \  ---    {  ---

    -->
        (*+-/.=?[   EXPR
        <           TYPE
        abcABC_     TYPE
        EOL         ---
        SPACE       continue

Combined with lower/uppercase for distinction between objects and data:

    File f = .open(arena, "abc.txt")
    point p = [1, 2]

Alternative solution 1: (instead of ref)

    object File f = .open(arena, "abc.txt")
    data Point p = [1, 2]
    int i = 123

How to name the keywords?

    obj      data
    object   value

Alternative solution 2: Sigils

    @File f = .open(arena, "abc.txt")
    $Point p = [1, 2]

    File @f = .open(arena, "abc.txt")
    Point $p = [1, 2]

Alternative solution 3:

    f: File = .open(arena, "abc.txt")
    p: point = [1, 2]
    i: int = 123


Should "value types" be able to be "aliased" or "threaded"?
-----------------------------------------------------------

Perhaps not? Or perhaps only inside objects? (i.e. not records)

And which types should it apply to:
- int, bool, etc?
- arrays?
- records?
- strings? Note that strings are immutable and work as if they were
  record types, but the pointer (or embedded 1-character string)
  could change.


Ambiguity with var and implicit ref
-----------------------------------

With implicit ref, what should the following mean?

    type SomeRecord = record { ... }
    type SomeObject = object { ... }

    func f()
    {
        var SomeRecord rec
        var SomeObject obj
    }

For "rec", it is obvious that the contents of the record should be
modifiable (after all, it is a kind of "value type").

For "obj" it is not clear what is meant by "var":
1. Is the object itself modifiable?
2. Can the reference be modified to point to some other object?

For consistency with record types, 1. might be a better choice.
To get 2., "var ref" syntax could be used, and to get both,
"var ref var" could be used.

Another option is to use separate keywords for the two cases
(which should be both short AND common/simple/non-academic words
or abbreviations).

Syntax examples (with "var ref"):

    var SomeObject          # modifiable object
    var ref SomeObject      # modifiable reference
    var ref var SomeObject  # modifiable reference to modifiable object

(forbid unnecessary "ref", such as "ref var", to avoid mistakes?)


Ambiguity (for user) between "object"s and "record"s
----------------------------------------------------

"object"s and "record"s have different semantics (reference vs value).
But this is not visible at the place where the objects/records are used:

    var Something s
    var Other o

It is not clear what "s" and "o" are.

Solutions:
1. Require that record types start with a lowercase letter,
   and that objects start with an uppercase letter?

    type point = record { ... }
    type File = object { ... }

2. Require objects to be used by reference, and records as data.

    ref File f
    record Point p
    #ref aliased Point p  <-- should this be allowed?
    #ref int ip           <-- and this?


Avoid array copies
------------------

Problem:

    [1024]byte arr1 = ...
    [1024]byte arr2 = ...
    [1024]byte which_arr = arr1   # oops, this copies

Solution 1: Require a "copy" or "ref" keyword after the "="?

    [1024]byte arr1 = ...
    [1024]byte arr2 = ...
    [1024]byte which_arr = ref arr1

Solution 2: Let the compiler decide (best solution?), but require an
explicit "copy" keyword if referencing is not possible?
(e.g. due to the arrays being "var")

    [1024]byte arr1 = ...
    [1024]byte arr2 = ...
    [1024]byte which_arr = arr1


Keywords
--------

Loops:
- loopend/loopempty are a bit too similar visually
- should there be a "loopbreak" also?

Assert:
- many languages either let assert be disabled, or disable it by default
- perhaps "require" is a better name?


Maybe allow type inference in some cases?
-----------------------------------------

Type inference can make code harder to understand if not used carefully,
but it can also remove redundancy.

### Solution 1:

Maybe SLUL should allow type inference if the variable is given the same
name as the type? This could remove a bit of redundancy (but there are
also cases where it makes the source code longer) and should not cause
any confusion.

For example:

    # before
    var File file = .open_read("abc.txt")
    var SomeThing st = .new()

    # after
    var file = .open(...)
    var some_thing = .new()

Problems:
- This requires that no two types have the same case normalized name.
- Types in CamelCase need to work with snake_case variables.
- It won't work when combining generics and typescopes:

    var list = .new()   # which type???

Also:
- Compared to one-letter variables, the source code actually gets
  longer in many cases. Compare these:

    # with one-letter identifier and no type inference:
    var File f = .open_read("abc.txt")
    string s = f.read_line()

    # with this, with type inference:
    var file = .open_read("abc.txt")
    string s = file.read_line()

- There are other ways to shorten things, without inference.
  For example, for construction of objects:

    # C++ style
    var FileReader fr("abc.txt")
    # Alternative style 1 (might not work so well with long argument lists)
    var FileReader("abc.txt") fr
    # Alternative style 2
    var FileReader fr = .("abc.txt")
    # Current style (with type scope)
    var FileReader fr = .open("abc.txt")

### Solution 2:

Allow an abbreviated form of type names (per-project? or per-file?)

    \slul 0.0.0
    ...
    \abbrev M Matrix
    \source file1.slul

This could also be generalized to handle any form of renames,
including renames to resolve name conflicts.

    \slul 0.0.0
    ...
    \depends fastlist 1.0
    \depends biglist 1.0
    ...
    \rename fastlist:List FL
    \rename biglist:List BL
    \source file1.slul

### Solution 3:

Allow "recently used" type names to be abbreviated:

    SomeList> list = .new()
    SL> other_list = .new()

### Solution 4:

Allow local abbreviations:

    abbrev SList SomeList
    abbrev NestedTC NestedThingContainer
    abbrev ElemWLN ElementWithLongName

    SList> list = .new()
    SList> other_list = .new()


[DONE] Problem with @ for versions / Problem with long declarations
-------------------------------------------------------------------

A lot of "web software" tries to obfuscate or hide e-mail addresses.
That could cause problems with @ in versioned identifiers:

    ident@1.0

Also, it can get quite long with classes and type params:

    func SomeQuiteLongType.do_some_kind_of_thing@1.23.4.56(arena, SomeParameterType some_parameter_name) -> SomeReturnType
    lifetime some_parameter_name >= return

Alternative solution 1: Use "ugly but functional and usable" syntax:

    func do_some_kind_of_thing @1.23.4.56
    in SomeQuiteLongType
    arena
    param SomeParameterType some_parameter_name
    return SomeReturnType
    lifetime some_parameter_name >= return

Alternative solution 2: Spacing AND abbreviations

    abbrev SQ = SomeQuiteLongType
    abbrev SP = SomeParameterType
    abbrev SR = SomeReturnType
    abbrev K = Key
    abbrev VI = InnerValue
    abbrev VO = OuterValue

    func SQ.do_some_kind_of_thing @1.23.4.56(arena, SP some_parameter_name) -> SR
    lifetime some_parameter_name >= return

Alternative solution 3: Only spacing and naming convention

    func SomeQuiteLongType.do_some_kind_of_thing @1.23.4.56 (arena, SomeParameterType some_parameter_name) -> SomeReturnType
    lifetime some_parameter_name >= return

Alternative solution 4: Modified function syntax, spacing and naming convention

    func SomeQuiteLongType.do_some_kind_of_thing @1.23.4.56
    arena
    param SomeParameterType some_parameter_name
    return SomeReturnType
    lifetime some_parameter_name >= return


[DONE] Shorten the module header for simple apps
------------------------------------------------

Currently, the minimal module header for hello world is:

    \slul 0.0.0
    \name helloworld
    \type cli

This should be shorter!

* Merge the types cli/gui/gui_cli into a single "app" module type, and
  make it the default. That way, \type becomes optional.
* Make \name optional for apps. It could default to "app". For apps, it's
  only used for determining the output filename anyway.

That leaves us with:

    \slul 0.0.0


Keep hello world short
----------------------

    \slul 0.0.0

    func SlulApp.main() -> SlulExitStatus
    {
        this.writeln("Hello world")
        return .ok
    }

The SlulApp class (in slulrt) should contain some basic println/writeln
method that covers the most basic use cases. I.e. without adding any
\depends lines.

Perhaps some interactive functions that can work either in a GUI or
in a CLI:

    prompt(string message) -> string
    confirm(string message) -> bool
    alert(string message)
    die(string message)

(And these should be overrideable by e.g. a GUI framework)

- On GNU/Linux, it might be possible to distinguish terminal from GUI by
  checking TERM=linux (or absent/blank) and any of the following:
    XDG_SESSION_TYPE=(wayland|x11)
    DESKTOP_SESSION
    DISPLAY
    WAYLAND_DISPLAY
    XAUTHORITY
  This works on Wayfire at least.
- When there is neither console output available nor any GUI, it could
  write to the syslog (on Linux) or use ReportEvent (on Windows).
  This works for die(), but not for the others.
- Any escape characters should be turned into replacement characters.
  This prevents both security and portability issues.
- Similarly, strange Unicode characters (RTL, control, etc) should be
  replaced with replacement characters.


Avoid special characters: Module headers
----------------------------------------

Instead of \ for module headers:

    \slul 0.0.0
    \name test
    \version 0.1.0

It could be a ":" at the end:

    slul: 0.0.0
    name: test
    version: 0.1.0

(Maybe some module headers should be renamed to better work as
"attribute:" rather than "\directive")


Avoid special characters: Type identifiers?
-------------------------------------------

Can this be done at all? Is it a good idea?

Currently:

    ref Thing t = .new(arena)
    t.set_type(.a)
    t.set_flags(.visible=true, .enabled=false)

Could/should the dots be skipped?

    ref Thing t = new(arena)
    t.set_type(a)
    t.set_flags(visible=true, enabled=false)


arena-refs vs non-arena refs
----------------------------

This could be confusing, because no major language has arenas.

    ref Thing t1 = ...
    arena Thing t2 = ...


Add tuple type and disallow it from containing certain types?
-------------------------------------------------------------

A tuple type could be useful for e.g. multiple return values:

    func Thing.do_stuff() -> (int x, int y)

Tuples:
* Can be initialized with or without (e.g. (1,0)) field names
* Can be compared
* Can't contain funcrefs
* Can't contain structs directly
* Can't contain arrays of funcrefs/structs

Structs:
* Can only be initialized with field names, e.g. (.x=1,.y=0)
* Can't be compared?
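
The tuple restrictions above could be checked with a simple recursive rule.
The following is a rough Python sketch (not SLUL and not cslul code). The
Type class and its "kind" strings are hypothetical, and two points are
assumptions that the notes do not settle: nested tuples follow the same
rules, and refs to structs are allowed since only "direct" structs are
excluded.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Type:
    """Hypothetical, simplified type description."""
    kind: str                        # "int", "ref", "funcref", "struct", "array", "tuple"
    elem: Optional["Type"] = None    # element type, for arrays only
    fields: Tuple["Type", ...] = ()  # field types, for tuples only


def allowed_in_tuple(t: Type) -> bool:
    """Apply the "Can't contain ..." rules to one tuple field type."""
    if t.kind in ("funcref", "struct"):
        return False                     # no funcrefs, no structs directly
    if t.kind == "array":
        return allowed_in_tuple(t.elem)  # no arrays of funcrefs/structs
    if t.kind == "tuple":
        return tuple_is_valid(t)         # assumption: nested tuples obey the same rules
    return True                          # scalars and refs (assumption: also refs to structs)


def tuple_is_valid(t: Type) -> bool:
    """True if every field of the tuple type is allowed."""
    return all(allowed_in_tuple(f) for f in t.fields)
```

For example, a tuple of two ints passes the check, while a tuple containing
an array of structs is rejected.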