GC without limited overhead for metadata ---------------------------------------- For each type with outgoing references (including opaque types that may or may not have outgoing refs), have a "scan-function" that builds a structure: ( size in bytes, array of (pointer to data, pointer to scan-function) for outgoing refs ) This "scan-function" needs to be included in interfaces as well (i.e. it needs to be part of the vtable) The stack needs either: 1. fat pointers, perhaps in a special area, in order to know the "scan-function" for each root reference. 2. information about the stack frame Option 2 is probably better. The overhead is: - metadata for stack frames (basically, "debug info"). - pointers to the scan function in interfaces - scan functions (in code segment) Generational GC without tracking reference changes -------------------------------------------------- Could count the amount/number/out-refs-count/in-refs-count of each type on each full scan. And then activate "tracking" of allocations on specific types, based on which types can point to what other types. This will probably only work will if generic parameters can be taken into account (otherwise generic types can point to anything). If types {c,d,e} can only be referenced by {a,b}, then {a,b,c,d,e} can be tracked. And then all {a,b} are scanned, and the unreachable {c,d,e} objects can be queued for deletion. Other possible GC optimizations ------------------------------- * Don't scan immutable objects (or arena chunks?) over and over again. * Could do GC during blocking I/O calls. * Can dirtyness/accessedness of pages be queried from the OS? - would require stopping all threads, so maybe not any real optimization? - maybe the check could be done AFTER having done the scanning? * Can the pages be protected/unmapped during GC, and then userfaultfd be used to handle pagefaults? Calling vtable functions vs having RTTI as data ----------------------------------------------- Calling a vtable function for interfaces could be a bottleneck if there are lots of objects that are reached from interfaced-typed references. Maybe RTTI refs could be 1) compact and 2) be de-duplicated? E.g. a 24-bit type + 8-bit repeat count. The RTTI pointer is "only" needed when the reference is a non-concrete type. ...but the types of the references can't be known when the object is allocated, so in practice, the RTTI pointer always has to be present if the type implements at least one interface. Return value separate-allocation optimization? ---------------------------------------------- Consider the following: func f code Obj obj = new obj.add_stuff stuff set_value obj.get_value end Here, `obj` is short-lived, but the value `obj.get_value` is longer-lived. Can the return value of `obj.get_value` be allocated in the normal arena, while `obj` itself and any objects allocated in `obj.add_stuff` could go into a temporary (LIFO?) arena? * Perhaps obj could be provided with two arena? But then they would have to be stored in the object. * Perhaps thread-local-storage could be used to store the current "normal" as well as "temporary" arena? And the temporary arena could then be changed during the execution of `f`. But... What if obj.get_value returns obj itself??