easiest is perhaps to have a tree structure, with a cache other data structures: * B-tree * some B-tree like structure with linear probing in each block (and some special handling for nodes with the same hash) simply use a "nested linear probing tree": - 0 - 32 entries: 1 node only with 32 items, using linear probing. [0123456789ABCDEF] - 33 - 64-d entries: 2 nodes: - root node with starting range for the sub node (1 sub node at start) - free space is used for direct node entries (low hashes) - second node contains only direct node entries (high hashes) ---> this means that resizing only needs to copy half of the entries to the new space (where they must be spread out / placed according to the hash) [01234567.......R] [8.9..A.BC.D..EF.] ---> but that would lead to an unbalanced array for the first node. [.0.1..23.4.5.67R] [8.9..A.BC.D..EF.] - 65-d - 96-d entries: 3 nodes: - root node with end hash for 2nd node, start hash for 3rd node - free space is used for direct node entries (lowest hashes) - second node contains only direct node entries (highest hashes) - third node contains only direct node entries (high hashes from root node, low hashes from second node) ---> this minimized copying again. ---> no, again we need to re-balance the arrays: [.0..1...23..4.RR] [..5.6.78.9..A.B] [..BC.D...E.F.] ... - 992-d (31*32) - 1024 entries: 33 nodes: - at this point we need multiple nodes with R entries. - rather than re-balancing the root and all blocks below, it might be better to re-balance the leaf nodes?