1 files changed, 142 insertions, 0 deletions
diff --git a/notes/arena_io_non_io.txt b/notes/arena_io_non_io.txt
new file mode 100644
index 0000000..6064b98
--- /dev/null
+++ b/notes/arena_io_non_io.txt
@@ -0,0 +1,142 @@
+arena pointers / io vs non-io
+=============================
+
+Could require all "io-capable" objects to have a arena information at
+the beginning of the page (i.e. the address with the low order bits
+set to zero).
+
+This way, constant data can still work as usual without special alignment
+or ordering requirements (e.g. having .rodata before .text in the same page,
+which might not even be possible).
+
+It also allows large data to span across a page boundary, as long as any
+"io-capable" objects are not allocated in a such page (since they need
+the arena-chunk header).
+
+Contents of the arena-chunk
+---------------------------
+
+* Pointer to arena base.
+* Size used in chunk (= offset to next free byte)
+
+If the arena-base is always allocated at a fixed offset (e.g. as the very
+first allocation), then the pointer and the size can be merged into a single
+pointer.
+
+
+Contents of the arena base
+--------------------------
+
+This can be allocated inside the first arena-chunk.
+
+* Capabilities
+* IO-function pointer overrides
+* Chunked-list of pointers to the arena-chunks.
+  (constant-sized block, but can continue in increasing block sizes
+   as normal allocations)
+
+How / where to store information for garbage collection?
+--------------------------------------------------------
+
+Some form of GC would be useful. I don't think it has to be very
+fine-grained, though. It also doesn't have to free everything, but it
+would be good to have some kind of guarantee that it frees e.g. at least
+say 80% of the completely unused "allocation units".
+
+However, "allocation unit" should probably be smaller than the arena-chunk,
+which can be quite large, in particular if the page size is large (like
+64 KiB or even 2 MiB).
+
+It has to be somewhat resistent against attacks. Putting random data that
+happens to resemble pointers MUST NOT affect memory usage, for two reasons:
+1) it could work as a Denial-of-Service attack, and 2) it could cause the
+heap memory base address to be leaked.
+
+The really tricky parts:
+
+* Can it be done without lots of overhead for meta-data / RTTI?
+    - Maybe just a bitmap with 1 bit per pointer-sized word.
+      That's 128 bytes per 4K page on x86, and 64 bytes on x86_64.
+    - It could also be an area that grows down from the top
+      (e.g. run-length encoded). But the overhead might make it
+      less efficient than a bitmap.
+    - Just tracking the pointers is not enough to free parts of a
+      chunk, nor to do copy collection.
+        - That would require knowing the start (or size) of each
+          object.
+        - Could have 2 bits (is-start, contains-pointers)
+        - Or could store only sizes.
+        - Maybe it doesn't need object granularity?
+            - Could have an "end-bit" per e.g. pointer-sized word.
+              It would indicate if at least one object ends inside the word.
+              Granularity -> Size usage -> Percentage of 4K page on 32-bit:
+                8 bytes     128+128 = 256       6.25%
+               64 bytes     128+8   = 136       3.32%
+            - With a pointer-bitmap + a 1 byte size (16 byte granularity),
+              assuming 20 byte allocations on average:
+               16 bytes     128+204 = 332       8.11%
+              assuming 60 byte allocations on average:
+               16 bytes     128+68  = 196       4.79%
+* With non-io chunks, the meta-data may cover a larger (combined) chunk, and
+  hence the meta-data part will be larger, too.
+* Is limited granularity a problem when detecting live objects.
+    - Yes, it will be less accurate! Objects within the same "object-block"
+      that have a higher address will be incorrectly assumed to be in use.
+* Can GC'ed objects safely be read by C code?
+* Can GC'ed objects safely be modified by C code?
+* Is any temporary space required to perform GC?
+    - Needs to mark used object-blocks.
+    - Needs to queue objects to scan?
+
+Weird ideas:
+
+* Keep an allocation journal (with object sizes and locations of pointers)
+    - Has to be very fast to append to.
+    - Has to be very compact.
+    - Even with only 1 byte used:
+        with avg alloc size of 20 bytes => 5.00% overhead
+        with avg alloc size of 60 bytes => 1.67% overhead        
+* Extremely coarse-grained GC:
+    - Just have a bitmap of pointers, don't bother about sizes.
+        Size: pagesize/ptralign/8
+        Overhead: 1.56% (64b) - 3.13% (32b)
+            (pagesize/ptralign/8)/pagesize = 1/(ptralign*8)
+        Possibly no overhead with CHERI
+        (because it already tracks this info)
+    - Track pointers that might point to another arena-chunk
+      (or even another arena).
+    - Garbage collect only entire arena-chunks.
+* Track only allocations in the "topmost" arena-chunk
+  (the one where new allocations go).
+    - Use a bitmaps for:
+        - words that contain a pointer
+        - allocation start status per word:
+            - 00 = inside allocation
+            - 01 = inside allocation, and is the last word of it
+            - 10 = start, hasn't escaped
+            - 11 = start, could have escaped
+        - 3 bits in total per word.
+        - On simple allocations w/o pointers, only a single bit has to be set
+        - Allocations with pointers will be word-aligned, so setting the
+          bitmap bits should be easy for them as well (on 64b platform it's
+          just a byte-aligned OR to the bitmap)
+    - Escaped blocks cannot be moved
+        - Therefore, trigger GC before escaping a block if let's say
+           >= 50% of the arena-chunk has been used and no GC has been
+           performed.
+        - Track free "holes", so escaped blocks can be moved there quickly
+          (by just copying them and updating the refs to them.
+           the refs update is just a linear scan).
+    - Set a "might need GC" flag when a pointer (that at least before pointed
+      to the arena-chunk) is somehow changed.
+        - Just adding new pointers cannot trigger GC
+        - Note that the pointers can come from stack or other chunks.
+    - Scan all words marked as containing a pointer
+        - Can use the next unallocated page as a temporary storage
+            - Will be all 0 at start
+            - Needs to be all 0 at end
+        - Is it possible to use dynamic programming can be used to scan
+          linearly a few times?
+        - If not, simply queue the roots and start scanning from those.
+    - Unmarked words are then free.
+    - Re-pack the movable words to use space more efficiently.