Garbage collection
==================

There are three types of memory leaks with arenas:
1. Unused tail parts of blocks.
2. Items that are no longer reachable.
3. Items that can't possibly be accessed in any reachable code path.
   (I don't think this is very well researched, so I will ignore this type)

There are at least 4 alternative solutions for handling leak types 1 & 2:

* Do nothing
* Put tail parts in a free-list and try to re-use whenever an arena is full.
* Conservative garbage collection
    - Needs to know the size of each alloc'n. This info can be packed/batched
      into words (e.g. 5*12 <= 64 for 4096 byte blocks on 64 bit platforms)
    - Does not solve the tail-parts problem
    - Can be DoS'ed by putting valid memory addresses in integers.
    - Can be used to leak memory addresses by adding lots of integers
      and measuring the memory usage after garbage collection.
* Compacting garbage collection
    - Needs to know the size of each allocation.
    - Needs to know where pointers in each allocation are.
        - Can store type info
        - Can store a bitmask.
        - Combining these two, for 4096 byte arenas:
            12 bits size + 1 bit mode + 51 bits type type info / bitmask
            51 bits = 408 bytes max.
        - I think that 408 bytes large objects WITH pointers are fairly
          uncommon, so perhaps we could use only bitmasks:
            12 bits size + 51 bits bitmask + 1 bit more-bitmasks-present
            optional: 63 bits bitmask + 1 bit more-bitmasks-present (up to 8)
        - Put this info before each allocataion? Or last?
          Some platforms will require alignment.
        - Apparently, OpenJDK puts all outgoing pointers first,
          so they just need a "number of outgoing pointers" field.
          (This would not work in SLUL, which needs structs with stable
           ordering of fields)
* Root-type-aware garbage collection
    - SLUL does not have variant types, so in theory, it could be possible
      to track only the types of the "roots" (i.e. references from local
      variables/function parameters).
    - Maybe this could use debugging information somehow?
      Or perhaps store the info in a special-purpose section?
        - It can be stored as code or data.
        - With code, it can trivially call code across libraries.
          But there is a performance penalty per call.
        - With data, the runtime could list all loaded/mapped modules,
          and then scan the ELF/PE header of each of them to find the
          GC meta-data. The runtime would have to bind all cross-module
          references in the meta-data. The initialization has to be done
          only once per process (or again if modules are loaded or unloaded
          at runtime).
    - But there are "slot" types / type parameters!
      I *think* that those could be determined from the stack trace,
      at least somewhere the type parameters <T,U,..> will have known
      values if you go far towards main() enough.
    - With known types, the size as well as the offsets to outbound
      references become known. (Plus any nested structs/arrays)
    - The GC could be enabled per-thread / per-arena.
        - This way, the garbage collection only needs to stop one thread
        - And "garbage" can be isolated to requests/users/etc. to give
          some form of resource control, so one user cannot slow down
          garbage collection for others.
        - It would have to track which arenas can have outbound/inbound
          references to each other.

Object categories
-----------------

It might be useful to categorize allocations as follows:

- has/lacks outgoing pointers.
- has/lacks outgoing mutable pointers.
- might have/lacks incoming interior pointers.
- might have/lacks incoming pointers from the heap.

Compacting GC for arenas
------------------------

Where to put the meta-data?
* in-line, before/after each allocation?
  how should this interact with alignment?
* in a separate memory area? (bump-allocated?)

Metadata bytes:

first byte 0-127:   pointer-free short block
first byte 128-255: other kind of block (bitmask):
    bit 6   -   contains pointers.
    bit 0-5 -   length (63 = long length)
long length
pointer locations (bitmask? on/off run-length encoding?)

Or perhaps store pointers to type information?