Compact encoding of expressions
===============================

First byte:
  - 7 bits: token type
  - 1 bit: long encoding

Second byte:
  - 6 bits: column advancement (if line adv != 0, then from start of line)
  - 2 bits: line advancement

Third-fourth byte:
  - 16 bits: string table ID (different namespaces for:
    types/methods/fields/locals/constructors/typeidents/strings)

Notes on the string table ID:

- The IDs could be sorted such that the most common ones have low (16 bit) IDs.
- This is a string, not a bound symbol, so the same ID might be used for
  different symbols/variables in different contexts!
- With a 32-bit ID, it would be possible to have an offset into a string
  table. BUT constructing such a string table (with immutable offsets) might
  be inefficient.

Long encoding
-------------

- 7 bits: token type
- 1 bit: long encoding (= 1)
- 8 bits: column number (0 = too long)
- 16 bits: line number (0 = too long)
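
The two layouts above can be sketched in Python as follows. This is only an illustration under several assumptions that the notes do not settle: the long-encoding flag is the top bit of the first byte, the line advancement occupies the top two bits of the second byte, multi-byte fields are little-endian, line and column numbers are 1-based, and "0 = too long" means a value that does not fit its field is stored as 0. All function names are hypothetical.

```python
import struct

LONG_FLAG = 0x80  # assumption: the long-encoding bit is the top bit of byte 1


def encode_compact(token_type, line_adv, col_adv, string_id):
    """Pack one token into the 4-byte compact form (long flag clear)."""
    if not 0 <= token_type < 1 << 7:
        raise ValueError("token type needs to fit in 7 bits")
    if not 0 <= line_adv < 1 << 2:
        raise ValueError("line advancement needs to fit in 2 bits")
    if not 0 <= col_adv < 1 << 6:
        raise ValueError("column advancement needs to fit in 6 bits")
    if not 0 <= string_id < 1 << 16:
        raise ValueError("string table ID needs to fit in 16 bits")
    # Assumption: line advancement sits above the column advancement.
    second = (line_adv << 6) | col_adv
    return struct.pack("<BBH", token_type, second, string_id)


def decode_compact(data):
    """Return (token_type, line_adv, col_adv, string_id) from 4 bytes."""
    first, second, string_id = struct.unpack("<BBH", data[:4])
    if first & LONG_FLAG:
        raise ValueError("token uses the long encoding")
    return first & 0x7F, second >> 6, second & 0x3F, string_id


def encode_long(token_type, line, column):
    """Pack token type plus absolute position into the 4-byte long form."""
    if not 0 <= token_type < 1 << 7:
        raise ValueError("token type needs to fit in 7 bits")
    # Assumption: an out-of-range position is stored as the sentinel 0.
    col_byte = column if 1 <= column <= 0xFF else 0
    line_word = line if 1 <= line <= 0xFFFF else 0
    return struct.pack("<BBH", LONG_FLAG | token_type, col_byte, line_word)


def advance(line, column, line_adv, col_adv):
    """Apply a compact token's advancement to the current position."""
    if line_adv != 0:
        # On a new line the column advancement counts from the line start.
        return line + line_adv, col_adv
    return line, column + col_adv
```

For example, `advance(3, 7, 0, 4)` stays on line 3 and moves to column 11, while `advance(3, 7, 2, 4)` moves to line 5, column 4. A writer would presumably fall back to `encode_long` whenever an advancement exceeds 3 lines or 63 columns, although the notes do not spell out that rule.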