SLUL - Safe Lightweight Usable Language
=======================================
This is a work in progress!

Goals
-----
* **Safe**: Memory safety. Capabilities. No loopholes or `unsafe`.
* **Lightweight**: Small and efficient native programs. Fast compilation.
* **Usable**: Make it easy to read, learn and use.
* **Modularity first**: Enforced API compatibility.
  Only *interfaces* of *direct* dependencies needed for compilation.
* **No Surprises**: Make actions and semantics explicit.

And of course combinations of these. E.g. memory safety should still be
guaranteed even if dependencies are upgraded to newer versions.

Target audience
---------------
Free/Open Source software (FOSS) developers who want a
**safe** and **simple** programming language that works
with **low-end** devices (e.g. single-board computers, old hardware, etc.).

Sample code
-----------

Not working yet!

    \slul 0.0.0
    \depends slulrt 0.0.0
    \depends thing 0.2

    func arena SlulApp.main() -> SlulExitStatus {
        ref var Thing t = .new(this)
        if t.get_color() == .red {
            t.paint(.color = .green)
        }
        return .success
    }

Language taxonomy
-----------------
Note: The language is very far from finished, and there is not even a
working compiler!

* **Static Type System**. Without implicit conversions of values. Nominal typing.
* **Procedural Language**. With modules and limited Object Orientation; data types can have constructors and methods. (Some kind of interface type might also be added.)
* **Immutable by default**. No mutable global state. Functions can only access their parameters, which are constant by default.
* **No Micro Management**. Automatic dereferencing. Simple memory management through usage of `arena`s. Freeing an arena frees all of its memory and resources.
* **Explicit Resource Management**. Coarse-grained, safe, management of resources using arenas. No implicit dropping of resources. No garbage collection or refcounting.
* **Sandboxing using Capabilities**. Arenas will serve as sandboxing/capabilities systems as well. Code will only be able to access the system via arenas.
* **Isolated Non-Recoverable Exceptions**. An exception only affects reachable arenas in the call-chain. Those become invalid and get freed. Isolation of call-chains will be possible.
* **Enforced Module Compatibility**. Strict API/ABI Compatibility is compiler-enforced, via hash values of interface versions.
* **C-compatible ABI**. SLUL code can call and be called by C code, but additional code needs to be written to check capabilities (or set up `seccomp` or similar).
* **Simple Syntax**. Reduced punctuation, verbose semantics, standard {} blocks.
* **Value Inference**. `uint64 v = 123` instead of `let v = 123u64`.
* **Namespace Inference**. `Color c = .rgb(255,0,0)` instead of `let c = Color(255,0,0)`.
* **Generic Typing**. Restricted to references and word-sized objects. Type erasure is used, so generic types support dynamic linking and do not cause additional machine code to be generated.
* **Restricted Null**. `ref` cannot be `none`, while `?ref` can.
* **No Significant Whitespace**, but incorrect indentation can generate warnings. Line breaks are significant in some places, though.

Safety features (The "S" in SLUL)
---------------------------------
**NOTE: Most things in this section are not yet checked.**

Memory safety will be enforced by the compiler:

* `ref`/`arena`/`own` keywords are used to specify basic lifetimes.
  And most code will be able to use arenas to avoid complex
  micro-management of memory (which could make SLUL faster than C
  in some cases).
* The `lifetime` keyword can be used to specify more complex lifetimes
  of function parameters, such as `lifetime a >= return`.
* Uninitialized data will not be allowed to be used.

API stability will be enforced by the compiler (and loader):

* Module versions are hashed, which prevents accidental modification.
* Hashes will be included in libraries (as one symbol per version), and will be
  linked against, to be able to detect if the wrong library version is
  installed.
* Module dependencies can safely depend on different (stable) versions of
  the same module, without any issues.
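
The hashing idea above can be sketched in C. This is only an illustration: the README does not say which hash function SLUL uses or exactly what is hashed, so 64-bit FNV-1a and the functions `interface_hash` and `interface_matches` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a over the text of an interface version.
   FNV-1a is a stand-in; the hash SLUL actually uses is not stated. */
static uint64_t interface_hash(const char *text)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */
    assert(text != NULL);
    for (size_t i = 0; text[i] != '\0'; i++) {
        h ^= (unsigned char)text[i];
        h *= UINT64_C(0x100000001b3);            /* FNV prime */
    }
    return h;
}

/* Hypothetical loader-side check: the hash recorded when the program was
   compiled must equal the hash of the interface that is installed. */
static int interface_matches(uint64_t expected, const char *installed_text)
{
    return interface_hash(installed_text) == expected;
}
```

Any edit to an already released interface version changes its hash, so a program built against the original interface would fail this check instead of silently running against a modified one.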

Principle of least privilege / Sandboxing / Capabilities:

* Only functions that take an `arena` parameter (usually the implicit
  `this` parameter) will be able to call system functions.
* `arena`s will contain function pointers for all system functions,
  so they can be overridden (or blocked).
* It will be possible to create more restricted `arena`s from a
  more powerful arena.
* `arena`s will be unforgeable (on the language level). So, assuming that
  no external unsafe code is called, one will have unforgeable capabilities.
* Modules *won't* have any initialization code, so it will be possible
  to set up sandboxing before calling into other modules.
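
A rough C model of the capability idea above may help. Everything here is hypothetical: the `struct arena` layout, the two "system functions" and the helper names are invented for this sketch and are not SLUL's actual runtime interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability table: one function pointer per system function. */
struct arena {
    int (*read_file)(const char *path);    /* 0 on success, -1 if denied */
    int (*write_file)(const char *path);
};

/* Stand-ins for real system calls. */
static int allow(const char *path) { (void)path; return 0; }
static int deny(const char *path)  { (void)path; return -1; }

/* A fully privileged arena, such as the one a program entry point receives. */
static struct arena arena_root(void)
{
    struct arena a = { allow, allow };
    return a;
}

/* Derive a more restricted arena from a more powerful one.
   Rights are copied from the parent and then only ever removed. */
static struct arena arena_read_only(const struct arena *parent)
{
    struct arena a;
    assert(parent != NULL);
    a = *parent;
    a.write_file = deny;
    return a;
}

/* Library code can reach the system only through the arena it was given. */
static int save_report(const struct arena *a, const char *path)
{
    return a->write_file(path);
}
```

Passing the read-only arena to `save_report` makes the write fail, while the same call with the root arena succeeds. In C nothing stops code from forging such a struct; the README's point is that SLUL would rule that out at the language level.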

Type safety:

* References can't be `none`, unless explicitly allowed with a `?ref` type.
  The compiler will refuse to compile code with potential dereferencing of
  `none` values.
* No unsafe casts are ever allowed. `byte` to `int` is allowed, but not
  vice versa (unless the conversion is preceded by a range check with `if`).
* Generic types are implemented safely and very efficiently using type
  erasure. This has relatively few downsides in a language like SLUL which
  lacks runtime type information. All values of a generic type are internally
  stored in a machine word; either a pointer to data or an integer that fits
  inside a data pointer.
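
To illustrate the type-erasure point, here is a hand-written C sketch of a container whose elements are all single machine words. The names (`word_stack`, `push_small_int`, and so on) are made up for this example; in SLUL the compiler would presumably insert the conversions that the typed wrappers perform here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_CAP 16

/* One implementation shared by all element types:
   every element is stored as a single machine word. */
struct word_stack {
    uintptr_t items[STACK_CAP];
    size_t len;
};

static void stack_push(struct word_stack *s, uintptr_t word)
{
    assert(s->len < STACK_CAP);
    s->items[s->len++] = word;
}

static uintptr_t stack_pop(struct word_stack *s)
{
    assert(s->len > 0);
    return s->items[--s->len];
}

/* Typed views over the same machine code. The integer variant assumes
   that the value fits in a pointer-sized word. */
static void push_small_int(struct word_stack *s, uint16_t value)
{
    stack_push(s, (uintptr_t)value);
}

static uint16_t pop_small_int(struct word_stack *s)
{
    return (uint16_t)stack_pop(s);
}

static void push_string(struct word_stack *s, const char *str)
{
    stack_push(s, (uintptr_t)(const void *)str);
}

static const char *pop_string(struct word_stack *s)
{
    return (const char *)(const void *)stack_pop(s);
}
```

Because `stack_push` and `stack_pop` are compiled once, a shared library can export them for every element type, which is why this approach works with dynamic linking and adds no extra machine code per type.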

Platform-independent by default:

* Code that compiles on one platform will compile on all other
  supported platforms (aside from memory limitations etc.),
  and will work the same way. (External non-SLUL libraries can obviously
  have platform-specific behavior.)
* The runtime library will hide most platform differences.

Pure by default:

* Functions are pure by default, unless a parameter is of `ref var` type.
* There is no global mutable data.
* Thread-shared mutable data must be marked with the `threaded` type
  qualifier (even if read-only).

No loopholes/`unsafe`:

* SLUL will not have any (intentional) loopholes in the type system.
  No `unsafe` keyword.
* Low-level libraries that access hardware or the kernel (such as a core
  runtime library) may need to be written in another language.
  (This is obviously only safe if the external code is written in a safe way)

Lightweightness (The "L" in SLUL)
---------------------------------
Low complexity/bloat:

* No (recoverable) exceptions
* No runtime type information
* No static initializers
* The runtime will be small

Memory allocation:

* Safe arena allocation with bump allocators, instead of micro-management
  (malloc/free) or garbage collection.
    - Per-thread/per-arena manually-invoked garbage collection *might* be
      added in the future.
* No additional function parameters for allocators. Each pointer
  has an arena associated to it (this will work by assigning fixed size
  memory chunks to each arena, and storing arena information in the lowest
  addresses).
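
The chunk scheme described above could look roughly like the following C11 sketch. The chunk size, the header fields and all function names are assumptions for illustration, and a real arena would chain several chunks rather than use a single one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE  ((size_t)4096)  /* assumed chunk size, a power of two */
#define ALLOC_ALIGN ((size_t)16)    /* assumed alignment of every allocation */

/* Header stored at the lowest address of each chunk. */
struct chunk {
    int arena_id;    /* stands in for the "arena information" */
    size_t used;     /* bytes used so far, including this header */
};

/* Allocate one chunk that is aligned to its own size (C11 aligned_alloc). */
static struct chunk *chunk_new(int arena_id)
{
    struct chunk *c = aligned_alloc(CHUNK_SIZE, CHUNK_SIZE);
    if (c == NULL)
        return NULL;
    c->arena_id = arena_id;
    c->used = sizeof(struct chunk);
    return c;
}

/* Bump allocation: round the offset up, then advance it. */
static void *chunk_alloc(struct chunk *c, size_t size)
{
    size_t start = (c->used + ALLOC_ALIGN - 1) & ~(ALLOC_ALIGN - 1);
    if (size == 0 || size > CHUNK_SIZE - start)
        return NULL;                /* does not fit in this chunk */
    c->used = start + size;
    return (unsigned char *)c + start;
}

/* Find the owning chunk of any pointer into it by masking the low bits.
   No allocator parameter has to be passed around for this to work. */
static struct chunk *chunk_of(const void *p)
{
    return (struct chunk *)((uintptr_t)p & ~(uintptr_t)(CHUNK_SIZE - 1));
}

/* Releasing the chunk releases every object inside it at once. */
static void chunk_free(struct chunk *c)
{
    free(c);
}
```

Since each chunk is aligned to its own size, clearing the low bits of a pointer always lands on the header, which is how a pointer can carry its arena without a second word.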

Language:

* The language is meant to be somewhat easy for someone else to re-implement.
  Most complexity will be in the semantic analysis.
* Parsing is mostly LL(k) (uses no or constant look-ahead), except for
  expression parsing, which uses the
  [shunting yard algorithm](https://en.wikipedia.org/wiki/Shunting_yard_algorithm).
* Semantic analysis should only require basic linear-time (or `n log(n)`)
  algorithms.
* It will be possible to decouple the runtime from the language/compiler.
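
For readers unfamiliar with the shunting yard algorithm mentioned above, here is a minimal C sketch that converts an infix expression to postfix. It is not taken from the SLUL compiler: it assumes single-character operands, the four left-associative operators `+ - * /` and parentheses, and the names `precedence`, `emit` and `to_postfix` are invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static int precedence(char op)
{
    switch (op) {
    case '*': case '/': return 2;
    case '+': case '-': return 1;
    default:            return 0;   /* '(' and non-operators */
    }
}

/* Append one character to the output, keeping room for the terminator. */
static int emit(char *out, size_t out_size, size_t *n, char c)
{
    if (*n + 1 >= out_size)
        return -1;
    out[(*n)++] = c;
    return 0;
}

/* Returns 0 on success, or -1 on unknown characters, unbalanced
   parentheses or lack of space. It does not check that operands and
   operators alternate. */
static int to_postfix(const char *in, char *out, size_t out_size)
{
    char ops[64];                   /* operator stack */
    size_t nops = 0, n = 0;

    if (out_size == 0)
        return -1;
    for (; *in != '\0'; in++) {
        char c = *in;
        if (c == ' ')
            continue;
        if (isalnum((unsigned char)c)) {
            if (emit(out, out_size, &n, c) != 0) return -1;
        } else if (c == '(') {
            if (nops == sizeof ops) return -1;
            ops[nops++] = c;
        } else if (c == ')') {
            while (nops > 0 && ops[nops - 1] != '(')
                if (emit(out, out_size, &n, ops[--nops]) != 0) return -1;
            if (nops == 0) return -1;       /* unmatched ')' */
            nops--;                         /* discard the '(' */
        } else if (precedence(c) > 0) {
            /* Left-associative: pop operators of higher or equal precedence. */
            while (nops > 0 && precedence(ops[nops - 1]) >= precedence(c))
                if (emit(out, out_size, &n, ops[--nops]) != 0) return -1;
            if (nops == sizeof ops) return -1;
            ops[nops++] = c;
        } else {
            return -1;                      /* unknown character */
        }
    }
    while (nops > 0) {
        if (ops[nops - 1] == '(') return -1;    /* unmatched '(' */
        if (emit(out, out_size, &n, ops[--nops]) != 0) return -1;
    }
    out[n] = '\0';
    return 0;
}
```

Each input character is handled once and each operator is pushed and popped at most once, so the conversion runs in linear time with no backtracking.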

Fast compilation:

* Clear separation of interface and implementation. To compile a
  program, the compiler ONLY needs the interface files (main.slul)
  of the direct dependencies. Plus the source code of the
  module being compiled, obviously ;)
* No support for compile-time instantiation (templates/traits) means
  faster compilation and smaller binaries.
* Fast non-optimizing compiler. It will generate machine code directly,
  without having to go through an assembler and linker.
* An optimizing compiler backend (using LLVM or gccjit) might be added
  at a later point.

Usable (The "U" in SLUL)
------------------------
Syntax:

* The syntax is mainly inspired by C, and also Pascal to some degree.
  Punctuation (like semicolons and parentheses) has been removed where
  possible.
* Self-synchronizing on error. Syntax errors do not continue across functions etc.

Semantics:

* Many complex features are intentionally left out. For example,
  closures and templates.
* No surprises. All operations are explicit. No implicit return values,
  no implicit copying, no implicit delete/drop, etc. This means that
  SLUL code will sometimes be slightly longer compared to other languages
  such as C++ or Rust.

Memory safety:

* While SLUL does not use automatic garbage collection, it does use arena
  allocation to avoid micro-management of memory. Many programs
  are only going to need a couple of explicit allocations/deallocations
  (or even none). This can be compared with C or Rust, where allocated
  objects are usually managed one by one.

Build system:

* Built-in build system that supports simple single-module projects.
* At compile-time, only the *interfaces* of *direct dependencies*
  are necessary. (At run time, the dynamic library files of *all*
  dependencies are obviously needed.)
* To target an old version of a library, you do NOT need to install
  the old version. Just declare the desired version in the `\depends` line
  for the module.
* Trivial cross-compilation (`--target` option to the compiler).
  No need for sysroots or similar.
* No need to install a development toolchain. SLUL will be fully
  self-contained. All that will be needed is an editor and a terminal.

Related programming languages
-----------------------------
* C, Pascal and Java were the main inspirations for this language.
  (But SLUL is very different from all of them).
* [Vale](https://vale.dev) seems to be somewhat similar, with arena allocation support and
  safety/isolation as goals.
* Other languages with one or more common goals:
    * [Austral](https://austral-lang.org) - [Austral Language GitHub](https://github.com/austral/austral/)
    * [C2 Language](http://www.c2lang.org) - [C2 Language GitHub](https://github.com/c2lang/c2compiler)
    * [C3 Language](https://github.com/c3lang/c3c)
    * [Cone Language](https://github.com/jondgoodwin/cone)
    * [Core Language](https://core-lang.dev)
    * [Hascal](https://hascal.github.io)
    * [Nim](https://www.nim-lang.org)
    * [Rust](http://www.rust-lang.org)
    * [Val](https://val-lang.dev/)
    * [Zig](https://ziglang.org/)
    * [Zimbu](http://www.zimbu.org) - [Zimbu Language specification](https://moolenaar.net/z/zimbu/spec/zimbu.html)
* See also this nice list of new programming languages: [ProgLangDesign.net](https://proglangdesign.net)

Thanks to
---------
* ["MaskRay" / Ray Song](https://maskray.me) for great blog posts with details about the ELF file format.
* Edsger Dijkstra for his ["shunting yard" algorithm](https://en.wikipedia.org/wiki/Shunting_yard_algorithm).
* [Codeberg](https://codeberg.org) for their Git repository hosting.
* All the authors of the software that was used for development of SLUL:
  Linux, Debian, GCC, Clang, TCC, Glibc, Musl, Valgrind, WINE, GNU Make, BSD Make, Git, Nano, and many more.

Status
------
A lot remains to be done! The first milestone is to have a PoC (proof of concept) for GNU/Linux.

* **Mostly done:**
    * Parsing
    * Error reporting
    * Compiler CLI
    * ELF file output: Object files (.o), dynamic libraries (.so) and executable files
* **In progress:**
    * IR generation
    * Semantic checking
    * Module system
    * arm64 code generation
* **Not started:**
    * Memory/capability management (arenas)
    * Runtime
    * Standard library
    * x86 code generation (32 and 64 bit)
    * PE file output (WINE/Windows .exe)
    * Line/debugging information (DWARF)
    * Select/conditional expression
    * Variant/safe union/sum type
    * ELF symbol versioning. This is needed for resolving naming conflicts.
* **Wanted, but not required for 1.0:**
    * More platforms: FreeBSD, AOSP/Android, Darling/macOS?
    * More target CPUs?
    * WASM code generation? (without a relooper)