stdlib for the bootstrap compiler
=================================
What should go into the bootstrap stdlib, and what should the APIs look like?
Areas that would be necessary:
* Data structures:
    - Strings
    - Lists
    - Maps
* File IO:
    - Files (corresponding to a file descriptor)
    - Separate buffers or buffer functionality embedded into file objects?
* CLI argument parsing
    - And help texts etc.
    - Environment variables
        - A variable might be assignable via both env-var and CLI-argument
        - Could have an `envVar` parameter to the CLI parameter definition
* Error output
* Localization
    - Would just be a no-op in the bootstrap compiler, but the API
      has to be designed.
* Arena management
    - Only stubs needed in bootstrap compiler:
        - For memory management
        - For sandboxing
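
The `envVar` idea above could look roughly like this in C. This is only a sketch: `struct cli_param`, `cli_param_resolve` and the precedence order (CLI argument, then environment variable, then default) are assumptions made for illustration, not a settled API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parameter definition: one value that can be set either
   on the command line or through an environment variable. */
struct cli_param {
    const char *long_name; /* e.g. "--outdir" */
    const char *env_var;   /* name of the env-var, or NULL for none */
    const char *fallback;  /* default value, may be NULL */
};

/* Returns the value for `p`. Assumed precedence: the CLI argument wins
   over the environment variable, which wins over the default.
   Only the "--name value" form is recognized in this sketch. */
static const char *cli_param_resolve(const struct cli_param *p,
                                     int argc, char **argv)
{
    assert(p != NULL && p->long_name != NULL);
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], p->long_name) == 0) {
            return argv[i + 1];
        }
    }
    if (p->env_var != NULL) {
        const char *v = getenv(p->env_var);
        if (v != NULL) {
            return v;
        }
    }
    return p->fallback;
}
```

Keeping the env-var name inside the parameter definition means that help texts could list both ways of setting a value from a single table.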
File I/O
--------
* InputFile, OutputFile, RWFile, AppendFile, SeekableInputFile, ...?
  (Or just one File type with multiple constructors, and functions
  returning I/O error if the operation is not allowed? Or use typestates
  to distinguish between them? Or subtypes, or interfaces?)
    - These should map to a file descriptor
        - The fd could be encoded into a pointer (but with some offset
          to allow for `none` values). This should work even with generic
          `slot` types, since integers will be supported there.
    - Maybe typestates could be used to allow only one buffer?
      Not every OS may have a `pwrite`, and some types of files
      don't support it anyway (non-seekable files such as pipes).
    - Disallow multiple command line arguments of different, conflicting
      types (e.g. OutputFile and InputFile) from pointing to the same file
      (perhaps compare inodes on *nix-like systems).
* InputStream, OutputStream, RWStream
    - These should have buffering
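
Two points from the list above can be sketched in C, under stated assumptions: the `file_handle` type and the names `file_from_fd`, `file_to_fd` and `same_file` are hypothetical. The first pair stores `fd + 1` in a pointer so that a null pointer can mean `none`; the integer/pointer casts are implementation-defined in C, but behave as expected on common platforms. `same_file` compares device and inode numbers, as suggested for *nix-like systems, and only works when both paths already exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>

/* Opaque handle type (hypothetical). It is never dereferenced. */
typedef struct file_handle file_handle;

/* Encodes fd as (fd + 1), so that fd 0 (stdin) does not collide
   with NULL, which is reserved for the `none` value. */
static file_handle *file_from_fd(int fd)
{
    assert(fd >= 0);
    return (file_handle *)((uintptr_t)fd + 1);
}

/* Returns the descriptor, or -1 if `f` is the `none` value. */
static int file_to_fd(const file_handle *f)
{
    if (f == NULL) {
        return -1;
    }
    return (int)((uintptr_t)f - 1);
}

/* Returns 1 if both paths name the same file, 0 if they do not,
   and -1 if either path cannot be examined. stat() follows symlinks,
   so two symlinks to one file compare as equal. */
static int same_file(const char *a, const char *b)
{
    struct stat sa, sb;
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0) {
        return -1;
    }
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Since an output file often does not exist yet, a check like `same_file` would probably have to be combined with some comparison of the (normalized) filenames.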
Output files from CLI are tricky, because we don't want to create (or worse,
truncate) files if an earlier stage fails.
* Some OSes might have some kind of "filename reference"?
* On Linux, there's O_PATH, but does that do what is wanted?
    - It seems that `openat`/`openat2` only support directories in the `fd`
      parameter, so no.
* On Linux, there's name_to_handle_at, but it uses an unprotected userspace
  buffer (not suitable for sandboxing) and doesn't support all filesystems
  anyway.
* Opening with O_CREAT|O_NOATIME|O_NOCTTY and without O_TRUNC, and performing
  truncation lazily almost works. But then the file needs to be deleted
  if the application never actually produces the file (including when
  it is terminated by a signal).
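
The lazy-truncation approach in the last bullet could be sketched like this in C. The `lazy_out` struct and its functions are hypothetical names. The sketch tries `O_EXCL` first so that it knows whether it created the file, and it leaves out `O_NOATIME`, which is Linux-specific and can fail with `EPERM` on files owned by someone else. Signal handling is not covered.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical state for an output file with deferred truncation. */
struct lazy_out {
    int fd;
    int created;      /* nonzero if our open() created the file */
    int truncated;    /* nonzero once the old contents are discarded */
    const char *path; /* must outlive the struct */
};

/* Opens `path` for writing without O_TRUNC. Returns 0 or -1.
   Note: there is a small race if the file is removed between the
   two open() calls; a real implementation might retry. */
static int lazy_out_open(struct lazy_out *o, const char *path)
{
    o->path = path;
    o->truncated = 0;
    o->created = 1;
    o->fd = open(path, O_WRONLY | O_CREAT | O_EXCL | O_NOCTTY, 0666);
    if (o->fd < 0 && errno == EEXIST) {
        o->created = 0;
        o->fd = open(path, O_WRONLY | O_NOCTTY);
    }
    return o->fd < 0 ? -1 : 0;
}

/* Discards the old contents on the first write, then writes all of buf. */
static int lazy_out_write(struct lazy_out *o, const void *buf, size_t len)
{
    const char *p = buf;
    assert(o->fd >= 0);
    if (!o->truncated) {
        if (ftruncate(o->fd, 0) != 0) {
            return -1;
        }
        o->truncated = 1;
    }
    while (len > 0) {
        ssize_t n = write(o->fd, p, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Gives up on the output. The file is removed only if we created it;
   a file that existed before is left in place. */
static void lazy_out_abort(struct lazy_out *o)
{
    if (o->fd >= 0) {
        close(o->fd);
    }
    if (o->created) {
        unlink(o->path);
    }
    o->fd = -1;
}
```

A pre-existing file keeps its old contents until the first `lazy_out_write`, so a failure in an earlier stage leaves it untouched. The error and signal paths would still need to reach something like `lazy_out_abort`, which is the hard part noted above.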
Alternative solution:
* Have `File`s correspond to a filename (like in Java)
* For `File`s opened from the command line, store the filename in a
  memory area that gets `mprotect`ed as read-only.
* Using `seccomp` on Linux, allow only the read-only arguments to be used
  in calls to `open()`.
    - Or `unveil` on OpenBSD.
* Files could also be blocked (with some reference counting system
  perhaps? or allow only one `File` to point to a specific file?
  but how should that work with symlinks?)
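
The read-only filename storage could be sketched as follows in C, assuming a system with `mmap`, `mprotect` and `MAP_ANONYMOUS`. The function name `freeze_filename` is hypothetical. Since a seccomp filter only sees the argument values of a system call and cannot read the memory they point to, a filter for `open()` would presumably compare the pointer against the address range of such a mapping; that part is not shown here.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Copies `name` into a fresh page-aligned mapping and then makes the
   mapping read-only. Returns NULL on failure. The mapping is meant to
   live for the rest of the process, so it is never unmapped on success. */
static const char *freeze_filename(const char *name)
{
    size_t len = strlen(name) + 1;
    long page = sysconf(_SC_PAGESIZE);
    size_t size;
    char *mem;

    assert(page > 0);
    /* Round the size up to a whole number of pages. */
    size = (len + (size_t)page - 1) / (size_t)page * (size_t)page;

    mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return NULL;
    }
    memcpy(mem, name, len);
    if (mprotect(mem, size, PROT_READ) != 0) {
        munmap(mem, size);
        return NULL;
    }
    return mem;
}
```

Using one mapping per filename wastes most of a page each time; a real implementation would more likely pack all command-line filenames into a single region before protecting it.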