notes/libc_crt0_rtld.txt


How to handle the libc/crt0/rtld problems
=========================================

Want:

* Cross-compilation without per-target sysroot or per-target toolchain
* Native or ahead-of-time compiled binaries (no JIT)
* Something that can possibly call native code
  (but this can require some extra steps)

Solution 1: slulrtld without crt0, but convertible to local rtld+crt0
---------------------------------------------------------------------

Use a custom PT_INTERP (rtld), and no startup code (the rtld can take care
of that).

For cases where an rtld is really needed, the *binary* can be transformed to
match a "template binary".

1. Copy PT_INTERP.
2. Move the code to leave room for crt0.
3. Update symbols etc. to point to the updated code.
4. Insert all segments/sections needed for crt0, but strip out any non-crt0
   stuff.
5. Set the entry point to the crt0.
6. Update the jump to main in crt0
   (this could possibly be tricky without object code).
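
Steps 1 and 5 only touch the ELF header and the program headers, so they are
the easy part. A minimal sketch in C, assuming a 64-bit native-endian ELF
image that has already been read into memory and the `<elf.h>` definitions
shipped with Linux libcs. The helper names `elf64_find_interp` and
`elf64_set_entry` are hypothetical, not part of any existing tool:

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Step 1: locate the PT_INTERP path inside an in-memory ELF64 image.
 * Returns a pointer into the image, or NULL if absent or malformed. */
const char *elf64_find_interp(const unsigned char *image, size_t len)
{
    Elf64_Ehdr eh;
    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr) ||
        eh.e_phoff > len)
        return NULL;
    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        size_t off = (size_t)eh.e_phoff + i * sizeof ph;
        if (off > len || len - off < sizeof ph)
            return NULL;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != PT_INTERP)
            continue;
        /* The path must lie inside the image and be NUL-terminated. */
        if (ph.p_filesz == 0 || ph.p_offset > len ||
            len - ph.p_offset < ph.p_filesz ||
            image[ph.p_offset + ph.p_filesz - 1] != '\0')
            return NULL;
        return (const char *)(image + ph.p_offset);
    }
    return NULL;
}

/* Step 5: overwrite e_entry so that it points at the inserted crt0.
 * Returns 0 on success, -1 if the image is not an ELF file. */
int elf64_set_entry(unsigned char *image, size_t len, Elf64_Addr entry)
{
    Elf64_Ehdr eh;
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    eh.e_entry = entry;
    memcpy(image, &eh, sizeof eh);
    return 0;
}
```

The hard steps (2, 3 and 6) are not covered here; they need relocation and
symbol information that this sketch does not look at.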


Solution 2: slulrtld without crt0, but "reverse-linkable" to object file(!)
---------------------------------------------------------------------------

Use a custom PT_INTERP (rtld), and no startup code (the rtld can take care
of that).

For cases where an rtld is really needed, the *binary* can be transformed
back to an object file and then linked with the system linker!

Can this work (assumed compiled to PIE/PIC)?
* .text can be moved freely.
* .bss can be moved along with .text.
* .rodata can be moved along with .text. SLUL does not allow references
  or initialization inside .rodata.
* There's no initialized .data in SLUL.

This requires the following steps:

1. Add a SLUL `rtmain.o` under the symbol `main`.
2. Add the .text, .rodata and .bss from the binary.
   (Recreate the sections from the segments.)
3. Convert the dynsyms to symbols:
    - Executables: For the service-types/entries data structure
    - Libraries: For the symbols that the library exports
    - Both: For any imported symbols.

Solution 3: slulrtld without crt0, with an option to emit object code instead
-----------------------------------------------------------------------------

Use a custom PT_INTERP (rtld), and no startup code (the rtld can take care
of that).

For cases where an rtld is really needed, an object file can be generated
instead, which can be linked with the system linker.

Solution 4: slulrtld without crt0, but provide a C loader
---------------------------------------------------------

The loader can be written in C, and that binary would be system specific.


Solution 5: Only emit libraries, and require explicit usage of a loader
-----------------------------------------------------------------------

The loader can be written in C, and that binary would be system specific.
The SLUL "program" (.so file) would only be CPU-architecture-specific.

This is a very simple, portable and standard solution. The downside is that
programs cannot be executed directly.

But, to make things easier for users:

* There could be a separate file extension, e.g. .x, .slulb, ...
* There could be some program header attribute in the binary that indicates
  that it is in fact a SLUL executable, and the minimum version of the
  run-time library.
* Having these two things, there could be `binfmt` support on Linux.

Binfmt format:

    :name:type:offset:magic:mask:interpreter:flags
    types:
        E = file extension
        M = magic
    flags:
        P = preserve argv0
        O = pass fd instead of path (be careful to not expose unreadable files!)
        C = credentials (use setuid/setgid flags of target binary instead of interpreter. implies `O`).
            this seems risky if setuid binaries can be renamed to match the file extension!
            better use a magic value?
        F = fix binary (= don't load lazily)

Binfmt entry:

    :SLUL:E::slulb::/usr/bin/runslul:P
    :SLUL:M::\x7fELF....:\xff\xff\xff\xff...:/usr/bin/runslul:POC
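
A registration line has seven colon-separated fields (name, type, offset,
magic, mask, interpreter, flags) and is written to
`/proc/sys/fs/binfmt_misc/register`. For type `E` the magic field holds the
file extension, and offset and mask stay empty. A small sketch of a formatter
for such lines, where `binfmt_format` is a hypothetical helper and not an
existing API:

```c
#include <stdio.h>
#include <string.h>

/* Formats a binfmt_misc registration line into buf.
 * Returns the length written, or -1 on invalid input or truncation. */
int binfmt_format(char *buf, size_t size, const char *name, char type,
                  const char *offset, const char *magic, const char *mask,
                  const char *interpreter, const char *flags)
{
    /* ':' is the delimiter used here, so these fields must not contain it. */
    if (strchr(name, ':') != NULL || strchr(interpreter, ':') != NULL)
        return -1;
    /* Only the two types listed above exist: extension or magic. */
    if (type != 'E' && type != 'M')
        return -1;
    int n = snprintf(buf, size, ":%s:%c:%s:%s:%s:%s:%s",
                     name, type, offset, magic, mask, interpreter, flags);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Calling it with `("SLUL", 'E', "", "slulb", "", "/usr/bin/runslul", "P")`
reproduces the extension-based entry above.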


Can the loader be written using the bootstrap subset of SLUL?

* It's a safe language so it typically cannot call mmap() etc
  (but `giveme`s could work around this, e.g. `UnrestrictedSystemAccess sys`)
* It needs to be able to handle some additional types:
    - pointers (but this can be an opaque type, e.g. `OpaqueSystemData`)
    - sizes (`size_t`)
    - file offsets (`off_t`)
* Maybe it is better to write at least large parts of it in C?


Solution 5+: Only emit "libraries", but provide tiny startup code to invoke the interpreter
-------------------------------------------------------------------------------------------

In this case, the startup code doesn't have to call into the libc or
anything like that. To the OS, the file is just a static binary. The
"rt0" code can then simply exec (without fork, with original arg0) the SLUL
rtld.

Note that binfmt can still be used as a fast path (avoiding mapping the target
binary twice).

Can this loader work without writing any memory? Tail-call into VDSO?
Pass argv and rtld path (string constant) to execve.
Note: Cannot use PATH here!
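
As a rough illustration, the whole "rt0" boils down to a single exec with an
absolute path and the untouched argv/envp. This sketch uses the libc wrapper
`execve()` for readability; a real rt0 would issue the syscall directly, and
`rt0_exec_rtld` is a hypothetical name:

```c
#include <unistd.h>

/* Replaces the current process image with the rtld. No fork, and argv[0]
 * is passed through unchanged. rtld_path must be an absolute path, since
 * execve() does not search PATH. Only returns (with -1) if the exec fails. */
int rt0_exec_rtld(const char *rtld_path, char *const argv[],
                  char *const envp[])
{
    execve(rtld_path, argv, envp);
    return -1;
}
```

Nothing here needs to write to memory: the path is a constant, and argv and
envp are the pointers that the kernel already placed on the stack.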

Even better, maybe the rtld can be given a fd, instead of a filename?
* could use getauxval(AT_EXECFD)
    - "GNU specific", but not really, there's always an aux-vector on Linux.
      (musl also has it since 1.1.0)
    - probably Linux-specific though
    - LD_SHOW_AUXV=1 sleep 1
        FD is absent if executed via a filename!
        then the filename is present in AT_EXECFN
* fexecve(fd,argv,env).
    - POSIX.1-2008
    - supports close-on-exec (as long as the interpreter is not a script)
* execveat(fd,"",argv,envp,AT_EMPTY_PATH)
    - this requires Linux kernel v. 3.19.
    - supports close-on-exec (as long as the interpreter is not a script)
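
For the `AT_EXECFD` variant, an rtld that runs before any libc is set up
would have to walk the aux-vector itself. A minimal sketch, assuming a 64-bit
target and the `Elf64_auxv_t` type from `<elf.h>`; `auxv_find` and
`auxv_exec_fd` are hypothetical names. Walking the vector by hand also
sidesteps the ambiguity of `getauxval()`, which returns 0 both for a missing
entry and for an entry whose value is 0:

```c
#include <elf.h>
#include <stdint.h>

/* Looks up `type` in an AT_NULL-terminated aux-vector.
 * Returns 1 and stores the value in *value if found, 0 otherwise. */
int auxv_find(const Elf64_auxv_t *auxv, uint64_t type, uint64_t *value)
{
    for (; auxv->a_type != AT_NULL; auxv++) {
        if (auxv->a_type == type) {
            *value = auxv->a_un.a_val;
            return 1;
        }
    }
    return 0;
}

/* Returns the fd passed in AT_EXECFD, or -1 if the entry is absent
 * (e.g. when the program was executed via a filename). */
int auxv_exec_fd(const Elf64_auxv_t *auxv)
{
    uint64_t fd;
    return auxv_find(auxv, AT_EXECFD, &fd) ? (int)fd : -1;
}
```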

How to communicate which fd is being used?
* Could use an environment variable, but want to avoid allocation etc.
* Could use a fixed fd (is this possible? maybe fd 3 can be closed first?),
  but that might break some workflows. Also, the lower fds might be closed,
  which could lead to an incorrect fd number (maybe loop until we get the
  right one?)
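
One way to get an exact fd number without looping is `dup2()`, which places
the descriptor at a chosen slot regardless of which lower fds are open. A
sketch under the assumption that clobbering an fd that is already in use
should be refused rather than done silently; `move_fd_to` is a hypothetical
helper:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Moves `fd` to exactly `target`. Fails with EBUSY if `target` is already
 * open, so that an inherited descriptor is never clobbered.
 * Returns `target` on success, -1 on error. */
int move_fd_to(int fd, int target)
{
    if (fd == target)
        return target;
    if (fcntl(target, F_GETFD) != -1) {
        errno = EBUSY;          /* slot is in use by something else */
        return -1;
    }
    if (dup2(fd, target) < 0)
        return -1;
    close(fd);                  /* keep only the copy at `target` */
    return target;
}
```

The copy made by `dup2()` does not have close-on-exec set, so it survives the
exec into the rtld.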

The RTLD needs to restore the process name with prctl(2) PR_SET_NAME.
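
A sketch of that step, assuming Linux and that the name should be the
basename of the program path; `restore_process_name` is a hypothetical
helper. The kernel keeps at most 16 bytes of the name including the
terminating NUL, so longer names are truncated:

```c
#include <string.h>
#include <sys/prctl.h>

/* Sets the process (thread) name to the basename of `path`.
 * Returns 0 on success, -1 on error. */
int restore_process_name(const char *path)
{
    const char *base = strrchr(path, '/');
    base = base ? base + 1 : path;
    return prctl(PR_SET_NAME, (unsigned long)base, 0, 0, 0);
}
```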

Solution 6: Emit "#! binaries"
------------------------------

This will break debuggers, but it should otherwise work. It is very portable
across unix-like systems.

It also doesn't work with setuid/setgid (those bits are ignored).


Option 7: slulrtld without crt0, and load C code in separate processes
----------------------------------------------------------------------

This provides some safety also. It might actually be a really good option.
And many C libraries won't work with SLUL's sandboxing anyway.


Option 8: slulrtld without crt0, and load C code in a VM
---------------------------------------------------------

C libraries could be executed in a virtual machine.


Option 9: slulrtld without crt0, and simulate the kernel to load libc
---------------------------------------------------------------------

The startup process of the kernel could be virtualized/simulated, so the
libc (including its rtld) could be loaded into an existing SLUL process.

How to load the crt0.o code:
* The object file might be unavailable
* System binaries might be statically linked (or use the wrong libc)
* System binaries
* Could provide a special slul executable for this in e.g.
  /usr/libexec/slul/libc-template-executable
  that could be mapped into memory (including reusing the VDSO) and 

Simulating the kernel load:

* Can the ELF PHDR be used as-is?
* Can the aux-vector be used as-is? (Only if the ELF PHDR can be)
    - No, because I think it goes just below the stack?
    - AT_RANDOM should probably be updated IF slul also uses that
      OR if libc is instantiated multiple times (but that feels like a
      bad idea, if it's even possible).
* Does the existing PHDR or aux-vector have to be updated?
    - Maybe, for debuggers?
    - Check how dlopen/libdl.so works?
* Need to allocate a separate stack. The libc might do persistent
  allocations here, so it can't be freed until the native libraries are
  unloaded.
* The crt0 will call a simulated `main()`
    - Will probably need relocation.
    - Will other parts of crt0 require allocation?
    - The simulated `main()` will need to do something like `longjmp()`
      out of itself and the crt0 startup code.
* There's probably no reason to simulate a return from `main()`. But if
  there is, it has to happen just before the process exits, since libc will
  terminate the process then.
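
The "`longjmp()` out of `main()`" idea can be sketched as follows. This is
only an illustration of the control flow: `fake_crt0` is a stand-in for the
real startup code (which would also initialize libc), all names are
hypothetical, and everything runs on one stack here, whereas the real thing
would jump back from the separate stack mentioned above:

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf escape_point;
static int captured_argc;

/* Simulated main(): records what the startup code passed in, then jumps
 * out instead of returning, so the startup code never reaches exit(). */
static int simulated_main(int argc, char **argv, char **envp)
{
    (void)argv;
    (void)envp;
    captured_argc = argc;
    longjmp(escape_point, 1);
}

/* Stand-in for crt0: calls main and exits the process with its result. */
static void fake_crt0(int (*main_fn)(int, char **, char **),
                      int argc, char **argv, char **envp)
{
    exit(main_fn(argc, argv, envp));
}

/* Runs the startup code and returns once the simulated main() was reached. */
int init_libc_and_escape(int argc, char **argv, char **envp)
{
    if (setjmp(escape_point) == 0)
        fake_crt0(simulated_main, argc, argv, envp);
    return captured_argc;
}
```

After `init_libc_and_escape()` returns, the frames of `fake_crt0` and
`simulated_main` are abandoned, which is why the stack they ran on has to
stay allocated if libc kept pointers into it.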

But the libc might do bad stuff:

* Close file descriptors (but it really shouldn't do that)
* Exit the process

ALSO - IMPORTANT: This might trigger crashes/misbehavior in either libc
or in other libraries. Is there some way to indicate in stack traces /
"debug dumps" that this is not a normal libc initialization, and that SLUL
could be to blame?
* Fake a .so mapping, e.g. __notice_libc_bogomapped_by_slul.so
* Fake a stacktrace entry, e.g. __notice__libc_bogomapped_by_slul
* PHDR entry?
* aux-vector entry?
* What terminology to use here?
    - bogusloaded
    - sneakyloaded
    - funnyloaded
    - bogomapped
    - funkymapped
    - wonkymapped
    - lazymapped
    - usermapped
    

Also, provide an environment variable that, e.g., forces eager loading of libc.
Naming?
* LD_SLUL_DISABLE_LIBC_BOGOLOADING
* LD_SLUL_EAGER_LOAD_LIBC
* LD_SLUL_SLOW_RELIABLE_LIBC_LOADING

Should LD_LIBRARY_PATH be taken into account when looking up the loaders?
In no event should the LD_ env-vars be used in setuid/setgid (AT_SECURE)
executables.
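
A sketch of such a guard, assuming a libc that provides `getauxval()` (a
libc-free rtld would read `AT_SECURE` from the aux-vector directly). The
names `slul_env` and `slul_env_auto` are hypothetical:

```c
#include <elf.h>
#include <stdlib.h>
#include <sys/auxv.h>

/* Returns the value of the environment variable `name`, or NULL if it is
 * unset or if the process runs in secure mode. */
const char *slul_env(const char *name, int secure)
{
    return secure ? NULL : getenv(name);
}

/* Same, with the secure flag taken from AT_SECURE, which the kernel sets
 * to a non-zero value for setuid/setgid executables. */
const char *slul_env_auto(const char *name)
{
    return slul_env(name, getauxval(AT_SECURE) != 0);
}
```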

All in all, 3 different files are needed:
  /lib<suffix>/slulrtld-<arch>.so.1
  /usr/libexec/slul/libc-bogomapping-template-executable-<libc>-<arch>
  /usr/libexec/slul/slulrtld-using-libc-<libc>-<arch>
    (or maybe the last one should be called slul-pseudortld-libc-<libc>-<arch>)
And a symlink:
  /usr/libexec/slul/slulrtld-using-libc


Things to check
---------------

ABI differences between ELF-based systems on the same CPU architecture?

Is it feasible to provide as good memcmp() and memset() as libc?
* But on the other hand, the compiler might want to inline such functions,
  and in that case it needs to have its own (non-libc) implementation.

Is it feasible to provide the syscall interfaces (that are used by SLUL)
in a way that is as good as libc?
* On Linux this means working around kernel bugs.
* On (some?) BSDs, it might require frequent updates to support
  new *BSD versions.