
SLUL - Safe Lightweight Usable Language

This is a work in progress!

Goals

  • Safe: Memory safety. Capabilities. No loopholes or unsafe.
  • Lightweight: Small and efficient native programs. Fast compilation.
  • Usable: Make it easy to read, learn and use.
  • Modularity first: Enforced API compatibility. Only interfaces of direct dependencies needed for compilation.
  • No Surprises: Make actions and semantics explicit.

And of course combinations of these. E.g. memory safety should still be guaranteed even if dependencies are upgraded to newer versions.

Target audience

Free/Open Source software (FOSS) developers who want a safe and simple programming language that works with low-end devices (e.g. single-board computers, old hardware, etc.).

Sample code

Not working yet!

\slul 0.0.0
\depends slulrt 0.0.0
\depends thing 0.2

func arena SlulApp.main() -> SlulExitStatus {
    ref var Thing t = .new(this)
    if t.get_color() == .red {
        t.paint(.color = .green)
    }
    return .success
}

Language taxonomy

Note: The language is very far from finished, and there is not even a working compiler!

  • Static Type System. Without implicit conversions of values. Nominal typing.
  • Procedural Language. With modules and limited Object Orientation; data types can have constructors and methods. (Some kind of interface type might also be added.)
  • Immutable by default. No mutable global state. Functions can only access their parameters, which are constant by default.
  • No Micro Management. Automatic dereferencing. Simple memory management through usage of arenas. Freeing an arena frees all of its memory and resources.
  • Explicit Resource Management. Coarse-grained, safe management of resources using arenas. No implicit dropping of resources. No garbage collection or refcounting.
  • Sandboxing using Capabilities. Arenas will serve as sandboxing/capabilities systems as well. Code will only be able to access the system via arenas.
  • Isolated Non-Recoverable Exceptions. An exception only affects reachable arenas in the call-chain. Those become invalid and get freed. Isolation of call-chains will be possible.
  • Enforced Module Compatibility. Strict API/ABI Compatibility is compiler-enforced, via hash values of interface versions.
  • C-compatible ABI. SLUL code can call and be called by C code, but additional code needs to be written to check capabilities (or set up seccomp or similar).
  • Simple Syntax. Reduced punctuation, verbose semantics, standard {} blocks.
  • Value Inference. uint64 v = 123 instead of let v = 123u64.
  • Namespace Inference. Color c = .rgb(255,0,0) instead of let c = Color(255,0,0).
  • Generic Typing. Restricted to references and word-sized objects. Type erasure is used, so generic types support dynamic linking and do not cause additional machine code to be generated.
  • Restricted Null. ref cannot be none, while ?ref can.
  • No Significant Whitespace, but incorrect indentation can generate warnings. Line breaks are significant in some places, though.

Safety features (The "S" in SLUL)

NOTE: Most things in this section are not yet checked.

Memory safety will be enforced by the compiler:

  • ref/arena/own keywords are used to specify basic lifetimes. Most code will be able to use arenas to avoid complex micro-management of memory (which could make SLUL faster than C in some cases).
  • The lifetime keyword can be used to specify more complex lifetimes of function parameters, such as lifetime a >= return.
  • Uninitialized data will not be allowed to be used.

API stability will be enforced by the compiler (and loader):

  • Module versions are hashed, which prevents accidental modification.
  • Hashes will be included in libraries (as one symbol per version), and will be linked against, to be able to detect if the wrong library version is installed.
  • Module dependencies can safely depend on different (stable) versions of the same module, without any issues.
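
The hashing idea above can be sketched in C. This is only an illustration: the README does not say which hash function SLUL uses or what exactly is hashed, so FNV-1a over the interface text is an assumption, and interface_matches is a hypothetical helper standing in for the check that the linker/loader would perform via per-version symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 64-bit. Chosen only because it is short; the hash that SLUL
   actually uses is not specified in this README (assumption). */
static uint64_t fnv1a64(const char *text)
{
    assert(text != NULL);
    uint64_t hash = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    for (size_t i = 0; text[i] != '\0'; i++) {
        hash ^= (unsigned char)text[i];
        hash *= 0x100000001b3ULL;            /* FNV prime */
    }
    return hash;
}

/* Hypothetical check: does the installed interface text match the hash
   that the program was compiled against? */
static int interface_matches(const char *installed_iface, uint64_t expected_hash)
{
    return fnv1a64(installed_iface) == expected_hash;
}
```

Any change to the interface text, even an added parameter, gives a different hash, so an accidentally modified interface version would no longer match.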

Principle of least privilege / Sandboxing / Capabilities:

  • Only functions that take an arena parameter (usually the implicit this parameter) will be able to call system functions.
  • arenas will contain function pointers for all system functions, so they can be overridden (or blocked).
  • It will be possible to create more restricted arenas from a more powerful arena.
  • arenas will be unforgeable (on the language level). So, assuming that no external unsafe code is called, one will have unforgeable capabilities.
  • Modules won't have any initialization code, so it will be possible to set up sandboxing before calling into other modules.
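
A rough C sketch of the function-pointer idea described above. All names (arena_caps, open_file, write_log, without_file_access, plugin_run) are hypothetical and not part of SLUL, and the "system functions" are stand-ins that only return a status code. Plain C cannot make the table unforgeable; that guarantee would have to come from the SLUL type system.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability table: one function pointer per system function. */
struct arena_caps {
    int (*open_file)(const char *path);
    int (*write_log)(const char *message);
};

/* Stand-ins for real system functions; they only return a status code. */
static int real_open_file(const char *path)    { (void)path; return 0; }
static int real_write_log(const char *message) { (void)message; return 0; }

/* Blocked variant: always fails instead of touching the system. */
static int denied_open_file(const char *path)  { (void)path; return -1; }

/* A "powerful" capability set, as a program entry point might receive it. */
static struct arena_caps full_caps(void)
{
    struct arena_caps caps = { real_open_file, real_write_log };
    return caps;
}

/* Derive a more restricted set: same table, but file access is blocked. */
static struct arena_caps without_file_access(struct arena_caps parent)
{
    parent.open_file = denied_open_file;
    return parent;
}

/* Code under sandboxing only reaches the system through the table it is given. */
static int plugin_run(const struct arena_caps *caps)
{
    assert(caps != NULL);
    if (caps->open_file("/etc/passwd") != 0) {
        caps->write_log("file access denied");
        return -1;
    }
    return 0;
}
```

Passing full_caps() to plugin_run succeeds, while passing without_file_access(full_caps()) makes the same code fail at the blocked call. Logging still works in both cases because that pointer was left untouched.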

Type safety:

  • References can't be none, unless explicitly allowed with a ?ref type. The compiler will refuse to compile code with potential dereferencing of none values.
  • No unsafe casts are ever allowed. Converting byte to int is allowed, but int to byte is only allowed after a range check with if.
  • Generic types are implemented safely and very efficiently using type erasure. This has relatively few downsides in a language like SLUL which lacks runtime type information. All values of a generic type are internally stored in a machine word; either a pointer to data or an integer that fits inside a data pointer.
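
As an illustration of type erasure with word-sized values, here is a minimal C sketch. The names (word, word_list, push_int, get_str, etc.) are made up for this example. In SLUL the compiler would presumably insert and check the conversions itself; here they are written as thin wrapper functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One machine word per element: either a data pointer or a small integer. */
typedef uintptr_t word;

/* The integer type used below must fit inside a data pointer. */
_Static_assert(sizeof(unsigned) <= sizeof(uintptr_t), "unsigned must fit in a word");

#define LIST_CAPACITY 8

struct word_list {
    word   items[LIST_CAPACITY];
    size_t count;
};

/* The same machine code serves every element type. */
static int list_push(struct word_list *list, word value)
{
    assert(list != NULL);
    if (list->count == LIST_CAPACITY) return -1;   /* list is full */
    list->items[list->count++] = value;
    return 0;
}

static word list_get(const struct word_list *list, size_t index)
{
    assert(list != NULL && index < list->count);
    return list->items[index];
}

/* Typed views of the erased list; no per-type copy of list_push/list_get. */
static int push_int(struct word_list *l, unsigned v)
{
    return list_push(l, (word)v);
}
static unsigned get_int(const struct word_list *l, size_t i)
{
    return (unsigned)list_get(l, i);
}
static int push_str(struct word_list *l, const char *s)
{
    return list_push(l, (word)(const void *)s);
}
static const char *get_str(const struct word_list *l, size_t i)
{
    return (const char *)(const void *)list_get(l, i);
}
```

Because only list_push and list_get contain real logic, a dynamic library can export them once and serve any element type that fits in a word.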

Platform-independent by default:

  • Code that compiles on one platform will compile on all other supported platforms (aside from memory limitations etc.), and will work the same way. (External non-SLUL libraries can obviously have platform-specific behavior.)
  • The runtime library will hide most platform differences.

Pure by default:

  • Functions are pure by default, unless a parameter is of ref var type.
  • There is no global mutable data.
  • Thread-shared mutable data must be marked with the threaded type qualifier (even if read-only).

No loopholes/unsafe:

  • SLUL will not have any (intentional) loopholes in the type system. No unsafe keyword.
  • Low-level libraries that access hardware or the kernel (such as a core runtime library) may need to be written in another language. (This is obviously only safe if the external code is written in a safe way.)

Lightweightness (The "L" in SLUL)

Low complexity/bloat:

  • No (recoverable) exceptions
  • No runtime type information
  • No static initializers
  • The runtime will be small

Memory allocation:

  • Safe arena allocation with bump allocators, instead of micro-management (malloc/free) or garbage collection.
    • Per-thread/per-arena manually-invoked garbage collection might be added in the future.
  • No additional function parameters for allocators. Each pointer has an arena associated with it (this will work by assigning fixed-size memory chunks to each arena, and storing arena information in the lowest addresses).
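
The chunk scheme described above can be sketched in C as follows. This is an assumption-laden illustration, not the SLUL runtime: the chunk size of 4096, the single chunk per arena, and all names (chunk_header, arena_init, arena_alloc, arena_of, arena_free) are invented for the example. It relies on C11 aligned_alloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE ((size_t)4096)   /* assumed chunk size; must be a power of two */

struct arena;

/* Lives at the lowest address of every chunk. */
struct chunk_header {
    struct arena *owner;   /* arena that this chunk belongs to */
    size_t used;           /* bytes handed out so far, header included */
};

struct arena {
    struct chunk_header *chunk;   /* a single chunk, to keep the sketch short */
};

static size_t round_up(size_t n)
{
    const size_t align = _Alignof(max_align_t);
    return (n + align - 1) / align * align;
}

/* Returns 0 on success, -1 if the chunk could not be allocated. */
static int arena_init(struct arena *a)
{
    assert(a != NULL);
    /* Aligning the chunk to its own size is what makes arena_of() work. */
    a->chunk = aligned_alloc(CHUNK_SIZE, CHUNK_SIZE);
    if (a->chunk == NULL) return -1;
    a->chunk->owner = a;
    a->chunk->used = round_up(sizeof(struct chunk_header));
    return 0;
}

/* Bump allocation: round the size up, then advance the offset. */
static void *arena_alloc(struct arena *a, size_t size)
{
    struct chunk_header *c = a->chunk;
    if (size == 0 || size > CHUNK_SIZE) return NULL;
    size = round_up(size);
    if (size > CHUNK_SIZE - c->used) return NULL;   /* chunk is full */
    void *p = (char *)c + c->used;
    c->used += size;
    return p;
}

/* Find the owning arena from any pointer into a chunk by masking off
   the low address bits. No allocator parameter is needed. */
static struct arena *arena_of(const void *p)
{
    uintptr_t base = (uintptr_t)p & ~(uintptr_t)(CHUNK_SIZE - 1);
    return ((struct chunk_header *)base)->owner;
}

/* Freeing the arena releases every object in it at once. */
static void arena_free(struct arena *a)
{
    free(a->chunk);
    a->chunk = NULL;
}
```

A real implementation would chain several chunks per arena and handle objects larger than one chunk. The sketch simply returns NULL in those cases.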

Language:

  • The language is meant to be somewhat easy for someone else to re-implement. Most complexity will be in the semantic analysis.
  • Parsing is mostly LL(k) (uses no or constant look-ahead), except for expression parsing, which uses the shunting yard algorithm.
  • Semantic analysis should only require basic linear-time (or n log(n)) algorithms.
  • It will be possible to decouple the runtime from the language/compiler.
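
For readers unfamiliar with the shunting yard algorithm mentioned above, here is a small C sketch. It is not taken from the SLUL compiler: it handles only non-negative integers, the operators + - * / and parentheses, and it evaluates the expression directly with two stacks instead of building a syntax tree. The function names are made up for this example.

```c
#include <assert.h>
#include <ctype.h>

#define STACK_MAX 64

static long values[STACK_MAX];   /* operand stack */
static char ops[STACK_MAX];      /* operator stack */
static int nvalues, nops;

/* Higher number binds tighter; '(' gets 0 so it is never reduced. */
static int precedence(char op)
{
    return (op == '*' || op == '/') ? 2 : (op == '+' || op == '-') ? 1 : 0;
}

/* Pop one operator and two operands, push the result. */
static void apply_top(void)
{
    assert(nops >= 1 && nvalues >= 2);
    char op = ops[--nops];
    long rhs = values[--nvalues];
    long lhs = values[--nvalues];
    switch (op) {
    case '+': lhs += rhs; break;
    case '-': lhs -= rhs; break;
    case '*': lhs *= rhs; break;
    case '/': assert(rhs != 0); lhs /= rhs; break;
    }
    values[nvalues++] = lhs;
}

/* Evaluates a well-formed expression. All operators are left-associative. */
static long evaluate(const char *s)
{
    nvalues = nops = 0;
    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s)) continue;
        if (isdigit((unsigned char)*s)) {
            long n = 0;
            while (isdigit((unsigned char)*s)) n = n * 10 + (*s++ - '0');
            s--;   /* the for loop advances past the last digit */
            assert(nvalues < STACK_MAX);
            values[nvalues++] = n;
        } else if (*s == '(') {
            assert(nops < STACK_MAX);
            ops[nops++] = '(';
        } else if (*s == ')') {
            while (nops > 0 && ops[nops - 1] != '(') apply_top();
            assert(nops > 0);   /* otherwise the ')' is unbalanced */
            nops--;             /* discard the '(' */
        } else {
            assert(precedence(*s) > 0);   /* otherwise an unknown character */
            /* Reduce pending operators of equal or higher precedence first. */
            while (nops > 0 && precedence(ops[nops - 1]) >= precedence(*s))
                apply_top();
            assert(nops < STACK_MAX);
            ops[nops++] = *s;
        }
    }
    while (nops > 0) apply_top();
    assert(nvalues == 1);
    return values[0];
}
```

For example, evaluate("1 + 2 * 3") reduces the multiplication before the addition, and evaluate("10 - 4 - 3") groups from the left. The input is scanned once from left to right without backtracking, which fits the constant look-ahead goal stated above.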

Fast compilation:

  • Clear separation of interface and implementation. To compile a program, the compiler ONLY needs the interface files (main.slul) of the direct dependencies. Plus the source code of the module being compiled, obviously ;)
  • No support for compile-time instantiation (templates/traits) means faster compilation and smaller binaries.
  • Fast non-optimizing compiler. It will generate machine code directly, without having to go through an assembler and linker.
  • An optimizing compiler backend (using LLVM or gccjit) might be added at a later point.

Usable (The "U" in SLUL)

Syntax:

  • The syntax is mainly inspired by C, and also Pascal to some degree. Punctuation (like semicolons and parentheses) has been removed where possible.
  • Self-synchronizing on error. Syntax errors do not continue across functions etc.

Semantics:

  • Many complex features are intentionally left out. For example, closures and templates.
  • No surprises. All operations are explicit. No implicit return values, no implicit copying, no implicit delete/drop, etc. This means that SLUL code will sometimes be slightly longer compared to other languages such as C++ or Rust.

Memory safety:

  • While SLUL does not use automatic garbage collection, it does use arena allocation to avoid micro-management of memory. Many programs are only going to need a couple of explicit allocations/deallocations (or even none). This can be compared with C or Rust, where allocated objects are usually managed one by one.

Build system:

  • Built-in build system that supports simple single-module projects.
  • At compile-time, only the interfaces of direct dependencies are necessary. (At run time, the dynamic library files of all dependencies are obviously needed.)
  • To target an old version of a library, you do NOT need to install the old version. Just declare the desired version in the \depends line for the module.
  • Trivial cross-compilation (--target option to the compiler). No need for sysroots or similar.
  • No need to install a development toolchain. SLUL will be fully self-contained. All that will be needed is an editor and a terminal.

Related programming languages

Thanks to

  • "MaskRay" / Ray Song for great blog posts with details about the ELF file format.
  • Edsger Dijkstra for his "shunting yard" algorithm.
  • Codeberg for their Git repository hosting.
  • All the authors of the software that was used for development of SLUL: Linux, Debian, GCC, Clang, TCC, Glibc, Musl, Valgrind, WINE, GNU Make, BSD Make, Git, Nano, and many more.

Status

A lot remains to be done! The first milestone is to have a PoC (proof of concept) for GNU/Linux.

  • Mostly done:
    • Parsing
    • Error reporting
    • Compiler CLI
    • ELF file output: Object files (.o), dynamic libraries (.so) and executable files
  • In progress:
    • IR generation
    • Semantic checking
    • Module system
    • arm64 code generation
  • Not started:
    • Memory/capability management (arenas)
    • Runtime
    • Standard library
    • x86 code generation (32 and 64 bit)
    • PE file output (WINE/Windows .exe)
    • Line/debugging information (DWARF)
    • Select/conditional expression
    • Variant/safe union/sum type
    • ELF symbol versioning. This is needed for resolving naming conflicts.
  • Wanted, but not required for 1.0:
    • More platforms: FreeBSD, AOSP/Android, Darling/macOS?
    • More target CPUs?
    • WASM code generation? (without a relooper)