aboutsummaryrefslogtreecommitdiffhomepage
path: root/notes/map_data_structure2.txt
blob: 4ba5e567cce43bf505a3dd148d93b8e6223f6070 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34

easiest is perhaps to have a tree structure, with a cache

other data structures:
* B-tree
* some B-tree like structure with linear probing in each block
  (and some special handling for nodes with the same hash)

simply use a "nested linear probing tree":
- 0 - 32 entries: 1 node only with 32 items, using linear probing.
    [0123456789ABCDEF]
- 33 - 64-d entries: 2 nodes:
    - root node with starting range for the sub node (1 sub node at start)
    - free space is used for direct node entries (low hashes)
    - second node contains only direct node entries (high hashes)
    ---> this means that resizing only needs to copy half of the
         entries to the new space (where they must be spread out / placed
         according to the hash)
    [01234567.......R] [8.9..A.BC.D..EF.]
    ---> but that would lead to an unbalanced array for the first node.
    [.0.1..23.4.5.67R] [8.9..A.BC.D..EF.]
- 65-d - 96-d entries: 3 nodes:
    - root node with end hash for 2nd node, start hash for 3rd node
    - free space is used for direct node entries (lowest hashes)
    - second node contains only direct node entries (highest hashes)
    - third node contains only direct node entries
      (high hashes from root node, low hashes from second node)
    ---> this minimized copying again.
    ---> no, again we need to re-balance the arrays:
    [.0..1...23..4.RR] [..5.6.78.9..A.B] [..BC.D...E.F.]
...
- 992-d (31*32) - 1024 entries: 33 nodes:
    - at this point we need multiple nodes with R entries.
    - rather than re-balancing the root and all blocks below,
      it might be better to re-balance the leaf nodes?