libc/ld vs custom runtime/loader
================================

libc/ld pros
------------
* can interoperate with non-SLUL code
* can output C code to let a C compiler do the backend part
* able to target any CPU/OS that there is a C compiler for
  (but there can be limitations on supported host OSes)
* no need for built-in "unsafe SLUL" (for the runtime/loader) with
  hexcode or assembler.
* gets basic debugging support "for free".

Custom runtime/loader pros
--------------------------
* easy cross-compilation "for free" without any additional effort
* does NOT need to bundle a C compiler on platforms that don't ship with one
  (e.g. Windows)
* only one target per CPU and executable format
  (no need to have separate targets for e.g.
   linux-2.6-libc-x.y, linux-2.6-musl, OpenBSD-X.Y targets)
* smaller amount of code that's needed for bootstrapping / smaller TCB
* can generate good debugging information. (In C code it is possible to
  override the source filename and line number in the debugging information,
  but nothing else.)
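
The filename/line override mentioned above is the C "#line" directive.
A minimal sketch of what generated C code could look like, assuming a
hypothetical SLUL source file named "hello.slul" (the file name, the line
number and the function names are made up for illustration):

```c
/* Everything after this directive is attributed to hello.slul, starting
 * at line 10. The file name and line number are illustrative only. */
#line 10 "hello.slul"
int slul_reported_line(void) { return __LINE__; }
const char *slul_reported_file(void) { return __FILE__; }
```

Compiler diagnostics and debugging information for the two functions then
point at hello.slul rather than at the generated C file.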


A middle ground
---------------
* have a SLUL loader and runtime
* write the loader and runtime in C
  (can the loader be a dynamically linked C program?)

A dynamic solution, that supports both C and SLUL runtimes/loaders
------------------------------------------------------------------
* Use a custom PT_INTERP value, e.g. /lib/ldslul-x86_64.so
* Dynamically link to slulrt.so but do not reference any symbols.
* On systems that use a libc-based SLUL runtime and loader:
    * /lib/ldslul-x86_64.so is a symbolic link to the system's loader
      (e.g. /lib64/ld-linux-x86-64.so.2)
    * slulrt.so contains _start and main that load all SLUL libraries
      (and resolve APIs etc, including those in slulrt). This library
      is specific for the combination of C library, OS, and CPU architecture.
    * Once done, main in slulrt.so calls SLUL's main function (in the
      executable)
* On systems that can use a statically linked SLUL runtime and loader:
    * /lib/ldslul-x86_64.so is a program written in an unsafe variant of
      SLUL, that calls the kernel directly and loads slulrt.so
    * slulrt.so is written in unsafe "SLUL" and calls the kernel directly.
* Executables that require the C runtime should explicitly specify the system's
  C loader and slulrt-ARCH.so (where ARCH includes the identifier of the libc)
* On Windows, you MUST use the SLUL runtime
* On OpenBSD, you MUST use the C runtime

* Advantages:
    + Binaries can be bit for bit identical regardless of what runtime is
      being used on the system.
    + Can work either with or without libc
* Disadvantages:
    - Not sure which OSes support having _start and main in a library.
    - Tools that scan for dependencies will think that the executable
      needs slulrt.so only.
    - Running "standalone" in a directory (without a system installation
      of SLUL) requires that the loader is executed manually, e.g.
      with a script.

This seems to almost work with these flags to GCC:
    app: -nostdlib --entry=_hang -fPIC appwithout_start.c -L . -lwith_start
    library: -fPIC -shared-libgcc -no-pie -z now -Wl,-soname,libwith_start.so -Wl,-export-dynamic libwith_start.c -Wl,--no-dynamic-linker
Problems:
    - The library gets placed in some random location, but the DT_INIT_ARRAY
      gets called with some seemingly static address instead of the one that
      the library has been loaded at.
        - Possibly due to missing relocations?
    - It seems to work *WITHOUT* the libc init code.
      In that case, _start in the *app* gets called (which is undesirable, but works).
      The _start function could save the stack pointer and call into some
      initialization function in libslulrt.so (which can call the real libc
      init functions also). Finally, libslulrt can call the real main
      function in the app.
    - BUT the regs (e.g. for passing information to the binary) might be different!
      Assuming that the regs are NOT used for dynamic calls (GOT/PLT, etc), then
      we can call into some initialization function in libslulrt.
A full analysis is needed to check which crt*.o objects need to be linked
in (if any) and what flags need to be given to the compiler/linker.
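
When experimenting with a custom PT_INTERP value, it helps to check which
loader an executable actually requests. A rough sketch, assuming a 64-bit
ELF image that is already in memory and uses the host's byte order, and
assuming the types from <elf.h> as found on Linux. The function name
elf64_interp is made up for this example:

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the NUL-terminated PT_INTERP string inside an
 * in-memory ELF64 image, or NULL if there is none or the image looks
 * malformed. Assumes the image uses the host's byte order. */
const char *elf64_interp(const unsigned char *img, size_t len)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    size_t i;

    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, img, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof ph)
        return NULL;
    /* The whole program header table must lie inside the image */
    if (eh.e_phoff > len || (len - eh.e_phoff) / sizeof ph < eh.e_phnum)
        return NULL;

    for (i = 0; i < eh.e_phnum; i++) {
        memcpy(&ph, img + eh.e_phoff + i * sizeof ph, sizeof ph);
        if (ph.p_type != PT_INTERP)
            continue;
        /* The string must lie inside the image and be NUL-terminated */
        if (ph.p_filesz == 0 || ph.p_offset > len ||
            ph.p_filesz > len - ph.p_offset ||
            img[ph.p_offset + ph.p_filesz - 1] != '\0')
            return NULL;
        return (const char *)(img + ph.p_offset);
    }
    return NULL;
}
```

For an executable linked as described above, this would be expected to
return "/lib/ldslul-x86_64.so".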


Alternative solution: Use #! lines
----------------------------------
* use a #! line
* that #! line can be followed by some ELF or non-ELF executable data.
* how does this work with memory mapping? maybe the loader can take care of all that?

    #!/lib/ld-slul-x86_64
    ...binary data...

* advantages:
    + very simple and not "locked" to the ELF format
* disadvantages:
    - can't use existing debuggers
    - has to reinvent the wheel for a lot of functionality
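
On the loader side, the first step would be to skip past the #! line to
find the binary data. A minimal sketch of that step, assuming the file
contents are already in a buffer (the function name slul_payload_offset
is made up for this example):

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first payload byte after the "#!" line,
 * or -1 if buf does not start with "#!" or contains no newline. */
long slul_payload_offset(const unsigned char *buf, size_t len)
{
    const unsigned char *nl;

    if (len < 2 || buf[0] != '#' || buf[1] != '!')
        return -1;
    nl = memchr(buf, '\n', len);
    if (nl == NULL)
        return -1;
    return (long)(nl - buf) + 1;
}
```

Note that the resulting offset is generally not a multiple of the page
size, while mmap() requires page-aligned file offsets. So the loader would
probably have to either copy the data or rely on padding in the file.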

Alternative solution: Use C ABI for _start/main only
----------------------------------------------------
* Have two code segments (or have one and merge the info):
    - One for SLUL code, generated by the SLUL compiler
    - One with pre-generated C code, created with a C compiler for the
      system. (Maybe this could go into a separate library?)
* On current *NIX systems, the C entry point (_start) is used.
* But a separate SLUL loader can be used to load such binaries.
  This way, it is possible to load binaries built for another system.
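
The pre-generated C part could be very small. A sketch under the
assumption that the SLUL entry point has a C-compatible signature (the
names slul_main_fn, slul_c_entry and example_slul_main are hypothetical
and not defined anywhere in these notes):

```c
/* Signature assumed for the SLUL entry point; not defined by the notes. */
typedef int (*slul_main_fn)(int argc, char **argv);

/* Body of the pre-generated C part: called from the C main(), i.e. after
 * the system's _start and libc initialization have already run. */
int slul_c_entry(int argc, char **argv, slul_main_fn slul_main)
{
    return slul_main(argc, argv);
}

/* Stand-in for compiler-generated SLUL code, for illustration only. */
int example_slul_main(int argc, char **argv)
{
    (void)argv;
    return argc - 1; /* number of arguments after argv[0] */
}
```

In a real binary, the C main() would simply forward its arguments to
slul_c_entry together with the address of the SLUL entry point. A separate
SLUL loader could skip the C part and jump to the SLUL entry point itself.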


How to allow selecting between multiple loaders and runtimes?
-------------------------------------------------------------
The loader and runtime may be coupled together, so a loader should only
be used with its accompanying runtime (and vice versa).

This can be supported with symlinked directories:

    /lib/ldslul-x86_64.so               ->  /usr/lib/x86_64-linux-gnu/slulcore/ldslul.so
    /lib/x86_64-linux-gnu/libslulrt.so  ->  /usr/lib/x86_64-linux-gnu/slulcore/libslulrt.so

    /usr/lib/x86_64-linux-gnu/slulcore ->   slulcore-glibc  (or slulcore-musl, slulcore-linuxstatic)

ldslul.so must include a hard-coded path to the accompanying libslulrt.so file.

This setup is not only useful for supporting both glibc and musl, but also
for allowing special-purpose or optimizing runtimes. For example:
- a special purpose debugging runtime that debug-logs all cross-module calls
  (or even allows them to be intercepted / mocked).
- a special purpose optimizing runtime that runs iterators in multiple
  threads.
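
Since a loader should only be used with its accompanying runtime, a sanity
check could compare the directories that the two symlinks resolve to. A
minimal sketch, assuming both paths have already been resolved (e.g. with
realpath(3)); the function name slul_same_core_dir is made up for this
example:

```c
#include <stddef.h>
#include <string.h>

/* Length of the directory part of path, including the trailing '/'.
 * Returns 0 if the path contains no '/'. */
static size_t dir_len(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? (size_t)(slash - path) + 1 : 0;
}

/* Returns 1 if the (already resolved) loader and runtime paths are in
 * the same directory, e.g. both in .../slulcore-glibc/, and 0 otherwise. */
int slul_same_core_dir(const char *loader, const char *runtime)
{
    size_t a = dir_len(loader);
    size_t b = dir_len(runtime);

    return a != 0 && a == b && strncmp(loader, runtime, a) == 0;
}
```

With the layout above, a glibc loader paired with a musl runtime would be
rejected, because the two files resolve to different slulcore-* directories.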