stdlib for the bootstrap compiler
=================================

What should go into the bootstrap stdlib, and what should the APIs look like?

Areas that would be necessary:

* Data structures:
    - Strings
    - Lists
    - Maps
* File IO:
    - Files (corresponding to a file descriptor)
    - Separate buffers or buffer functionality embedded into file objects?
* CLI argument parsing
    - And help texts etc.
    - Environment variables
        - A variable might be assignable via both env-var and CLI-argument
        - Could have an `envVar` parameter to the CLI parameter definition
* Error output
* Localization
    - Would just be a no-op in the bootstrap compiler, but the API has to be designed.
* Arena management
    - Only stubs needed in bootstrap compiler:
        - For memory management
        - For sandboxing

File I/O
--------

* InputFile, OutputFile, RWFile, AppendFile, SeekableInputFile, ...? (Or just one File type with multiple constructors, and functions returning an I/O error if the operation is not allowed? Or use typestates to distinguish between them? Or subtypes, or interfaces?)
    - These should map to a file descriptor
        - The fd could be encoded into a pointer (but with some offset to allow for `none` values). This should work even with generic `slot` types, since integers will be supported there.
    - Maybe typestates could be used to allow only one buffer? Not all OSes might have a pwrite, and some types of files don't support that anyway (non-seekable files such as pipes).
    - Disallow multiple command line arguments of different conflicting types (e.g. OutputFile and InputFile) to point to the same filename (perhaps compare inode on *nix-like systems).
* InputStream, OutputStream, RWStream
    - These should have buffering

Output files from CLI are tricky, because we don't want to create (or worse, truncate) files if an earlier stage fails.

* Some OSes might have some kind of "filename reference"?
* In Linux, there's O_PATH, but does that do what is wanted?
    - It seems that `openat`/`openat2` only support directories in the `dirfd` parameter, so no.
* In Linux, there's name_to_handle_at, but it uses an unprotected userspace buffer (not suitable for sandboxing) and doesn't support all filesystems anyway.
* Opening with O_CREAT|O_NOATIME|O_NOCTTY and without O_TRUNC, and performing truncation lazily almost works. But then the file needs to be deleted if the file never gets created on the application level (including on signals).

Alternative solution:

* Have `File`s correspond to a filename (like in Java)
* For `File`s opened from the command line, store the filename in a memory area that gets `mprotect`ed as read-only.
* Using `seccomp` on Linux, allow only the read-only arguments to be used in calls to `open()`.
    - Or `unveil` on OpenBSD.
* Files could also be blocked (with some reference counting system perhaps? or allow only one `File` to point to a specific file? but how should that work with symlinks?)
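
On the symlink question and the idea of comparing inodes to reject conflicting command line arguments: the following is only a rough sketch in Python, not a proposal for the actual stdlib API. The names `file_identity` and `find_conflicts` and the `"input"`/`"output"` role strings are made up for illustration. It assumes a *nix-like system where `stat` follows symlinks, so two paths that reach the same file through symlinks or hard links get the same (device, inode) pair.

```python
import os


def file_identity(path):
    """Return a key that is equal for two paths naming the same file.

    Existing files are keyed by (device, inode); os.stat follows symlinks,
    so symlinks and hard links to one file compare equal. A file that does
    not exist yet is keyed by its parent directory's identity plus its name.
    Limitation: two different dangling symlinks to the same missing target
    are not recognized as the same file.
    """
    try:
        st = os.stat(path)
        return ("inode", st.st_dev, st.st_ino)
    except FileNotFoundError:
        parent, name = os.path.split(path)
        pst = os.stat(parent or ".")
        return ("new", pst.st_dev, pst.st_ino, name)


def find_conflicts(args):
    """Given (role, path) pairs, return pairs of paths that refer to the
    same file but were passed with different roles."""
    seen = {}  # identity -> (role, path) of the first argument seen
    conflicts = []
    for role, path in args:
        key = file_identity(path)
        if key in seen and seen[key][0] != role:
            conflicts.append((seen[key][1], path))
        else:
            seen.setdefault(key, (role, path))
    return conflicts
```

For example, `find_conflicts([("input", "a.txt"), ("output", "link-to-a.txt")])` would report the pair if `link-to-a.txt` is a symlink to `a.txt`. A check like this is inherently racy (the filesystem can change between the check and the later `open()`), so it would presumably complement the sandboxing ideas above rather than replace them.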