What items can be used in error messages AFTER parsing, i.e. which
source locations do we need to track?
(during parsing we know the current source location)

These need to be tracked:
 * Expressions (including identifier references)
 * Declared identifiers
 * Control statements

Alternatively, track the following - this is method 6:
 * Declarations (types/data/functions)
 * Statements (these could then have relative line/column counting also)
 * RPN (also using relative counting)

-----

How to store them?

1) Store directly in expression/declaration:

    uint32 line
    uint32 column
    const char *filename

   (96/128 bits = 12/16 bytes, depending on pointer size)

2) Store somewhat compressed, but still directly in
   expression/declaration:

    uint32 line
    uint16 column
    uint16 file_id

   (64 bits = 8 bytes)

3) Store line and column only, and detect file from arena

    uint32 line
    uint32 column

   (64 bits = 8 bytes)

4) Store an index to some array (separate arrays for file-local stuff
   and symbols visible outside of the file)

    uint32 index

   (32 bits = 4 bytes)

   array of bytes:

    0xxx xxxx  Increase line number
    1xxx xxxx  Increase column number
    0000 0000  Change of file (following bytes are some kind of
               file identifier or pointer)

   Two repeated increases of the same type indicate that the
   following bits are high(er) order bits.

   (usually 8-16 bits = 1-2 bytes)

   Total: 5-6 bytes

5) Store subexprs as an array in variable-length RPN format and
   encode it as follows:
   (it would be nice if the array could be stored separately)

   First:

    0ccc cccc            signed change of column number
    1ccc cccc llll llll  signed change of column and line number
    0000 0000            int16 column  int16 line
    1111 1111            int32 column  int32 line

   (usually 8-16 bits = 1-2 bytes)

   Then the operator/terminal type follows.

   Declarations and top-level exprs still need full (absolute)
   line/column info.

6) Store a uint32/uint32 line/column for each declaration
   (should the filename be detected from the arena?)
   For each statement have a uint16 as follows:

    int8  start_line_increment (relative to previous statement)
    uint8 start_column

   If start_column is 0, read an (uint32,uint32) tuple after the
   statement*

   For each subexpr have a uint32 as follows:

    uint8

   *  This requires that we detect "end of arena block" when reading,
      and then continue from the next arena (which must then start
      with the tuple OR there must be reserved space for a pointer at
      the end of the arena to the next location).

   *2 If we use the hack above, then we could as well skip pointers
      and store everything as a stream of bytes.

   *3 And in that case, we could dump those streams to disk as a
      target and host independent format, e.g:

       uint32 offset to implementations
       ...declarations...
       ...implementations...
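
A minimal sketch of the per-statement encoding in method 6, written
as a byte stream in the spirit of note *2. Everything named here
(SrcPos, put_u32, get_u32, encode_stmt_pos, decode_stmt_pos) is
hypothetical and not taken from any existing code. It assumes that
columns are 1-based, so that start_column = 0 is free to act as the
escape value, and that the escape tuple is stored little-endian in
the order (line, column). The statement body is left out, so the
tuple follows the two header bytes directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Absolute source position (hypothetical helper type). */
typedef struct {
    uint32_t line;
    uint32_t column;   /* assumed 1-based; 0 is reserved as escape */
} SrcPos;

/* Write/read a uint32 in little-endian order, independent of host. */
static void put_u32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

static uint32_t get_u32(const uint8_t *in)
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Encode the start of a statement relative to the previous one.
   Short form (2 bytes): int8 start_line_increment, uint8 start_column.
   Long form (10 bytes): 0, 0, then an absolute (uint32,uint32) tuple.
   "out" must have room for 10 bytes. Returns the bytes written. */
size_t encode_stmt_pos(uint8_t *out, SrcPos prev, SrcPos cur)
{
    int64_t dline = (int64_t)cur.line - (int64_t)prev.line;

    if (dline >= INT8_MIN && dline <= INT8_MAX &&
        cur.column >= 1 && cur.column <= UINT8_MAX) {
        out[0] = (uint8_t)(dline & 0xff);   /* two's complement int8 */
        out[1] = (uint8_t)cur.column;
        return 2;
    }

    out[0] = 0;
    out[1] = 0;                             /* start_column == 0: escape */
    put_u32(out + 2, cur.line);
    put_u32(out + 6, cur.column);
    return 10;
}

/* Decode one statement position. Returns the bytes consumed. */
size_t decode_stmt_pos(const uint8_t *in, SrcPos prev, SrcPos *cur)
{
    if (in[1] != 0) {
        /* Sign-extend the int8 increment without relying on casts. */
        int delta = in[0] >= 128 ? (int)in[0] - 256 : (int)in[0];
        cur->line = (uint32_t)((int64_t)prev.line + delta);
        cur->column = in[1];
        return 2;
    }

    cur->line = get_u32(in + 2);
    cur->column = get_u32(in + 6);
    return 10;
}
```

With this layout a statement that starts within 127 lines of the
previous one and at column 255 or less costs 2 bytes; anything else
falls back to the 10-byte escape form. Fixing the byte order of the
tuple is what would make the stream usable as the host independent
on-disk format mentioned in note *3.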