SLUL - Safe Lightweight Usable Language
=======================================

This is a work in progress!

Goals
-----

* **Safe**: Memory safety. Capabilities. No loopholes or `unsafe`.
* **Lightweight**: Small and efficient native programs. Fast compilation.
* **Usable**: Make it easy to read, learn and use.
* **Modularity first**: Enforced API compatibility.
  Only *interfaces* of *direct* dependencies needed for compilation.
* **No Surprises**: Make actions and semantics explicit.

And of course combinations of these. E.g. memory safety should still be
guaranteed even if dependencies are upgraded to newer versions.

Target audience
---------------

Free/Open Source software (FOSS) developers who want a **safe** and
**simple** programming language that works with **low-end** devices
(e.g. single-board computers, old hardware, etc.).

Sample code
-----------

Not working yet!

    \slul 0.0.0
    \depends slulrt 0.0.0
    \depends thing 0.2

    func arena SlulApp.main() -> SlulExitStatus
    {
        ref var Thing t = .new(this)
        if t.get_color() == .red {
            t.paint(.color = .green)
        }
        return .success
    }

Language taxonomy
-----------------

Note: The language is very far from finished, and there is not even a
working compiler!

* **Static Type System**.
  Without implicit conversions of values. Nominal typing.
* **Procedural Language**.
  With modules and limited Object Orientation; data types can have
  constructors and methods.
  (Some kind of interface type might also be added.)
* **Immutable by default**.
  No mutable global state. Functions can only access their parameters,
  which are constant by default.
* **No Micro Management**.
  Automatic dereferencing. Simple memory management through the use of
  `arena`s. Freeing an arena frees all of its memory and resources.
* **Explicit Resource Management**.
  Coarse-grained, safe management of resources using arenas.
  No implicit dropping of resources.
  No garbage collection or refcounting.
* **Sandboxing using Capabilities**.
  Arenas will serve as sandboxing/capability systems as well.
  Code will only be able to access the system via arenas.
* **Isolated Non-Recoverable Exceptions**.
  An exception only affects reachable arenas in the call chain.
  Those become invalid and get freed.
  Isolation of call chains will be possible.
* **Enforced Module Compatibility**.
  Strict API/ABI compatibility is compiler-enforced, via hash values of
  interface versions.
* **C-compatible ABI**.
  SLUL code can call and be called by C code, but additional code needs
  to be written to check capabilities (or set up `seccomp` or similar).
* **Simple Syntax**.
  Reduced punctuation, verbose semantics, standard {} blocks.
* **Value Inference**.
  `uint64 v = 123` instead of `let v = 123u64`.
* **Namespace Inference**.
  `Color c = .rgb(255,0,0)` instead of `let c = Color(255,0,0)`.
* **Generic Typing**.
  Restricted to references and word-sized objects.
  Type erasure is used, so generic types support dynamic linking and
  do not cause additional machine code to be generated.
* **Restricted Null**.
  `ref` cannot be `none`, while `?ref` can.
* **No Significant Whitespace**,
  but incorrect indentation can generate warnings.
  Line breaks are significant in some places, though.

Safety features (The "S" in SLUL)
---------------------------------

**NOTE: Most things in this section are not yet checked.**

Memory safety will be enforced by the compiler:

* `ref`/`arena`/`own` keywords are used to specify basic lifetimes.
  Most code will be able to use arenas to avoid complex
  micro-management of memory (which could make SLUL faster than C
  in some cases).
* The `lifetime` keyword can be used to specify more complex lifetimes
  of function parameters, such as `lifetime a >= return`.
* Uninitialized data will not be allowed to be used.

API stability will be enforced by the compiler (and loader):

* Module versions are hashed, which prevents accidental modification.
* Hashes will be included in libraries (as one symbol per version),
  and will be linked against, to be able to detect if the wrong
  library version is installed.
* Module dependencies can safely depend on different (stable) versions
  of the same module, without any issues.

Principle of least privilege / Sandboxing / Capabilities:

* Only functions that take an `arena` parameter (usually the implicit
  `this` parameter) will be able to call system functions.
* `arena`s will contain function pointers for all system functions,
  so they can be overridden (or blocked).
* It will be possible to create more restricted `arena`s from a more
  powerful arena.
* `arena`s will be unforgeable (on the language level). So, assuming
  that no external unsafe code is called, one will have unforgeable
  capabilities.
* Modules *won't* have any initialization code, so it will be possible
  to set up sandboxing before calling into other modules.

Type safety:

* References can't be `none`, unless explicitly allowed with a `?ref`
  type. The compiler will refuse to compile code with potential
  dereferencing of `none` values.
* No unsafe casts are ever allowed. `byte` to `int` is allowed, but
  not vice versa (but it will be allowed if there is a range check
  with `if` before).
* Generic types are implemented safely and very efficiently using
  type erasure. This has relatively few downsides in a language like
  SLUL which lacks runtime type information. All values of a generic
  type are internally stored in a machine word; either a pointer to
  data or an integer that fits inside a data pointer.

Platform-independent by default:

* Code that compiles on one platform will compile on all other
  supported platforms (aside from memory limitations etc.), and will
  work the same way. (External non-SLUL libraries can obviously have
  platform-specific behavior.)
* The runtime library will hide most platform differences.

Pure by default:

* Functions are pure by default, unless a parameter is of `ref var` type.
* There is no global mutable data.
* Thread-shared mutable data must be marked with the `threaded`
  type qualifier (even if read-only).

No loopholes/`unsafe`:

* SLUL will not have any (intentional) loopholes in the type system.
  No `unsafe` keyword.
* Low-level libraries that access hardware or the kernel (such as a
  core runtime library) may need to be written in another language.
  (This is obviously only safe if the external code is written in a
  safe way.)

Lightweightness (The "L" in SLUL)
---------------------------------

Low complexity/bloat:

* No (recoverable) exceptions
* No runtime type information
* No static initializers
* The runtime will be small

Memory allocation:

* Safe arena allocation with bump allocators, instead of
  micro-management (malloc/free) or garbage collection.
    - Per-thread/per-arena manually-invoked garbage collection
      *might* be added in the future.
* No additional function parameters for allocators. Each pointer has
  an arena associated with it (this will work by assigning fixed-size
  memory chunks to each arena, and storing arena information in the
  lowest addresses of each chunk).

Language:

* The language is meant to be somewhat easy for someone else to
  re-implement. Most complexity will be in the semantic analysis.
* Parsing is mostly LL(k) (it uses either no look-ahead or constant
  look-ahead), except for expression parsing, which uses the
  [shunting yard algorithm](https://en.wikipedia.org/wiki/Shunting_yard_algorithm).
* Semantic analysis should only require basic linear-time
  (or `n log(n)`) algorithms.
* It will be possible to decouple the runtime from the
  language/compiler.

Fast compilation:

* Clear separation of interface and implementation. To compile a
  program, the compiler ONLY needs the interface files (main.slul)
  of the direct dependencies. Plus the source code of the module
  being compiled, obviously ;)
* No support for compile-time instantiation (templates/traits) means
  faster compilation and smaller binaries.
* Fast non-optimizing compiler.
  It will generate machine code directly, without having to go
  through an assembler and linker.
* An optimizing compiler backend (using LLVM or gccjit) might be
  added at a later point.

Usable (The "U" in SLUL)
------------------------

Syntax:

* The syntax is mainly inspired by C, and also Pascal to some degree.
  Punctuation (like semicolons and parentheses) has been removed
  where possible.
* Self-synchronizing on error. Syntax errors do not continue across
  functions etc.

Semantics:

* Many complex features are intentionally left out. For example,
  closures and templates.
* No surprises. All operations are explicit. No implicit return
  values, no implicit copying, no implicit delete/drop, etc. This
  means that SLUL code will sometimes be slightly longer compared to
  other languages such as C++ or Rust.

Memory safety:

* While SLUL does not use automatic garbage collection, it does use
  arena allocation to avoid micro-management of memory. Many programs
  are only going to need a couple of explicit
  allocations/deallocations (or even none). This can be compared with
  C or Rust, where allocated objects are usually managed one by one.

Build system:

* Built-in build system that supports simple single-module projects.
* At compile time, only the *interfaces* of *direct dependencies* are
  necessary. (At run time, the dynamic library files of *all*
  dependencies are obviously needed.)
* To target an old version of a library, you do NOT need to install
  the old version. Just declare the desired version in the `\depends`
  line for the module.
* Trivial cross-compilation (`--target` option to the compiler).
  No need for sysroots or similar.
* No need to install a development toolchain. SLUL will be fully
  self-contained. All that will be needed is an editor and a terminal.

Related programming languages
-----------------------------

* C, Pascal and Java were the main inspirations for this language.
  (But SLUL is very different from all of them.)
* [Vale](https://vale.dev) seems to be somewhat similar, with arena
  allocation support and safety/isolation as goals.
* Other languages with one or more common goals:
    * [Austral](https://austral-lang.org) -
      [Austral Language GitHub](https://github.com/austral/austral/)
    * [C2 Language](http://www.c2lang.org) -
      [C2 Language GitHub](https://github.com/c2lang/c2compiler)
    * [C3 Language](https://github.com/c3lang/c3c)
    * [Cone Language](https://github.com/jondgoodwin/cone)
    * [Core Language](https://core-lang.dev)
    * [Hascal](https://hascal.github.io)
    * [Nim](https://www.nim-lang.org)
    * [Rust](http://www.rust-lang.org)
    * [Val](https://val-lang.dev/)
    * [Zig](https://ziglang.org/)
    * [Zimbu](http://www.zimbu.org) -
      [Zimbu Language specification](https://moolenaar.net/z/zimbu/spec/zimbu.html)
* See also this nice list of new programming languages:
  [ProgLangDesign.net](https://proglangdesign.net)

Thanks to
---------

* ["MaskRay" / Ray Song](https://maskray.me) for great blog posts
  with details about the ELF file format.
* Edsger Dijkstra for his
  ["shunting yard" algorithm](https://en.wikipedia.org/wiki/Shunting_yard_algorithm).
* [Codeberg](https://codeberg.org) for their Git repository hosting.
* All the authors of the software that was used for development of
  SLUL: Linux, Debian, GCC, Clang, TCC, Glibc, Musl, Valgrind, WINE,
  GNU Make, BSD Make, Git, Nano, and many more.

Status
------

A lot remains to be done! The first milestone is to have a
PoC (proof of concept) for GNU/Linux.

* **Mostly done:**
    * Parsing
    * Error reporting
    * Compiler CLI
    * ELF file output: Object files (.o), dynamic libraries (.so)
      and executable files
* **In progress:**
    * IR generation
    * Semantic checking
    * Module system
    * arm64 code generation
* **Not started:**
    * Memory/capability management (arenas)
    * Runtime
    * Standard library
    * x86 code generation (32 and 64 bit)
    * PE file output (WINE/Windows .exe)
    * Line/debugging information (DWARF)
    * Select/conditional expression
    * Variant/safe union/sum type
    * ELF symbol versioning.
      This is needed for resolving naming conflicts.
* **Wanted, but not required for 1.0:**
    * More platforms: FreeBSD, AOSP/Android, Darling/macOS?
    * More target CPUs?
    * WASM code generation? (without a relooper)
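
Since arena management has not been started yet, the following C sketch is
only an illustration of the idea mentioned under "Memory allocation": each
arena gets fixed-size, size-aligned chunks, and the arena information sits at
the lowest addresses of the chunk, so it can be found from any pointer by
masking. It is not SLUL's implementation. All names here (`CHUNK_SIZE`,
`ArenaHeader`, `chunk_new`, `chunk_alloc`, `arena_of`) and the 64 KiB chunk
size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical chunk size: must be a power of two for the masking trick. */
#define CHUNK_SIZE ((size_t)1 << 16)

/* Hypothetical arena information, stored at the start of every chunk. */
typedef struct ArenaHeader {
    int arena_id;   /* stand-in for real arena/capability data */
    size_t used;    /* bump offset, counted from the chunk start */
} ArenaHeader;

/* Allocates one chunk whose address is a multiple of CHUNK_SIZE. */
static ArenaHeader *chunk_new(int arena_id)
{
    ArenaHeader *h = aligned_alloc(CHUNK_SIZE, CHUNK_SIZE);
    if (h == NULL) return NULL;
    h->arena_id = arena_id;
    h->used = sizeof(ArenaHeader);
    return h;
}

/* Bump allocation inside a single chunk; returns NULL when it is full. */
static void *chunk_alloc(ArenaHeader *h, size_t size)
{
    size_t align = _Alignof(max_align_t);
    size_t start = (h->used + align - 1) & ~(align - 1);
    if (size == 0 || size > CHUNK_SIZE - start) return NULL;
    h->used = start + size;
    return (char *)h + start;
}

/* Recovers the arena header from any pointer into the chunk by
   clearing the low address bits. No allocator parameter is needed. */
static ArenaHeader *arena_of(const void *p)
{
    assert(p != NULL);
    return (ArenaHeader *)((uintptr_t)p & ~(uintptr_t)(CHUNK_SIZE - 1));
}
```

With this layout, freeing an arena amounts to calling `free` on its chunks.
A real design would also need to chain several chunks per arena and handle
objects larger than one chunk, which this sketch leaves out.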