goals:
- don't require an allocation resize operation.
- deterministic (either a non-random algorithm, or deterministic randomness)
- no possibility of DoS despite the random number generator being deterministic.

idea 1:
- a small hashmap -> int (index) to keep track of items matching that hash
- an array of pointers -> hashmaps of increasing (but fixed) size
  (rough sketch of this second bullet below)
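
a rough sketch of the second bullet, assuming power-of-two level sizes and
keying by hash alone (all names here are made up):

    #include <stdint.h>
    #include <stdlib.h>

    #define MAX_LEVELS 16

    struct slot { uint32_t hash; void *value; int used; };

    struct level_map {
        struct slot *levels[MAX_LEVELS]; /* level i holds (8 << i) slots, never resized */
        int n_levels;
    };

    /* probe every allocated level; no resize ever moves an item */
    static void *lm_get(struct level_map *m, uint32_t hash)
    {
        for (int i = 0; i < m->n_levels; i++) {
            struct slot *s = &m->levels[i][hash & ((8u << i) - 1)];
            if (s->used && s->hash == hash)
                return s->value;
        }
        return NULL;
    }

    /* insert into the newest level; on collision, allocate the next
       (larger, but fixed-size) level instead of reallocating.
       note: an already-present hash is not updated in place here. */
    static int lm_put(struct level_map *m, uint32_t hash, void *value)
    {
        for (;;) {
            if (m->n_levels > 0) {
                int i = m->n_levels - 1;
                struct slot *s = &m->levels[i][hash & ((8u << i) - 1)];
                if (!s->used) {
                    s->hash = hash; s->value = value; s->used = 1;
                    return 0;
                }
            }
            if (m->n_levels == MAX_LEVELS)
                return -1;
            m->levels[m->n_levels] = calloc(8u << m->n_levels, sizeof(struct slot));
            if (!m->levels[m->n_levels])
                return -1;
            m->n_levels++;
        }
    }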

idea 2:
- use a skiplist?
- can we guarantee a worst-case time bound? (a randomized skiplist is only
  O(log n) in expectation -- see the node-height sketch below)
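
a minimal sketch of deterministic node heights, derived from the key's hash
instead of an RNG (note the DoS goal above: anyone who knows the hash
function can craft keys that all get height 1):

    #include <stdint.h>

    /* deterministic "coin flips": consume bits of the key's hash;
       expected height over uniform hashes is still ~2 */
    static int node_height(uint64_t hash, int max_height)
    {
        int h = 1;
        while ((hash & 1) && h < max_height) {
            hash >>= 1;
            h++;
        }
        return h;
    }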

idea 3:
- with at most 10 000 items, can we do some optimizations?
- but with a comparison-based structure, n operations will always cost
  O(n log(n)) total, i.e. O(log n) per lookup (sketch below)
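
for scale, a sketch of one such optimization (a sorted fixed-capacity array;
names made up): log2(10 000) is about 13.3, so a lookup is at most ~14
comparisons.

    #include <stddef.h>
    #include <stdint.h>

    struct entry { uint64_t key; void *value; };

    /* lower-bound binary search over a sorted array:
       at most ~log2(n) + 1 comparisons, ~14 for n = 10 000 */
    static void *find(const struct entry *a, size_t n, uint64_t key)
    {
        size_t lo = 0, hi = n;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (a[mid].key < key)
                lo = mid + 1;
            else
                hi = mid;
        }
        return (lo < n && a[lo].key == key) ? a[lo].value : NULL;
    }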

idea 4:
- have a shared "pool" for all hashmaps (per context for thread safety).
  this pool can have (hash ^ table-id) as keys.
  table-id could be a pointer (maybe(?) hashed/merged into 32 bits;
  key-mixing sketch after this list).
  advantages:
  - small hashmaps with few items don't have to use up space.
  - the pool can use large *reallocatable* blocks of 4 kB multiples
    (these can become arena blocks when free'd)
  disadvantages:
  - each allocation needs a void* for the table-id
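
a sketch of the key mixing mentioned above; folding the table pointer into
32 bits this way is an assumption, any decent mixer would do:

    #include <stdint.h>

    /* combine the per-table id (here: the table's address) with the
       item hash, so one shared pool can serve many logical maps */
    static uint32_t pool_key(const void *table, uint32_t hash)
    {
        uint64_t p = (uint64_t)(uintptr_t)table;
        uint32_t id = (uint32_t)(p ^ (p >> 32)); /* fold pointer to 32 bits */
        return hash ^ (id * 2654435761u);        /* Knuth multiplicative mix */
    }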

idea 5:
- have an array (hash high order bits) -> bitmask (hash low order bits)
  for quick detection of non-existent items (filter sketch after this list)
- have an array (hash high order bits) -> (hash low order bits / pointer)
  as an LRU lookup cache, maybe with limited (say max 8 steps) linear probing.
- have some kind of tree for other lookups
  (can we start from the cached item?)
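
a sketch of the first bullet: one 64-bit mask per "high bits" slot acts like
a one-probe Bloom filter -- a clear bit proves absence, a set bit only means
"maybe present" (the sizes here are arbitrary):

    #include <stdint.h>

    #define FILTER_SLOTS 1024 /* indexed by the top 10 bits of a 32-bit hash */

    struct filter { uint64_t mask[FILTER_SLOTS]; };

    static void filter_add(struct filter *f, uint32_t hash)
    {
        f->mask[hash >> 22] |= 1ull << (hash & 63);
    }

    /* 0 => definitely absent; 1 => maybe present, fall through to the
       LRU cache / tree lookup described above */
    static int filter_maybe_has(const struct filter *f, uint32_t hash)
    {
        return (f->mask[hash >> 22] >> (hash & 63)) & 1;
    }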