Code Overview
=============

Directory Overview
------------------

The source directories:

    src-backend     - The backend part. Also called CSBE (CSlul BackEnd)
    |-- codegen     - The code generators (e.g. aarch64)
    \-- outformat   - The output formats (e.g. ELF)
    src-common      - Common code for unit testing etc.
    src-cslul       - The compiler frontend (parser/analyzer/ir-gen) and CLI.
    \-- winlibc     - Minimal libc for WINE/Windows.
    src-runtime     - The runtime library for SLUL programs.

The directories with test SLUL source code:

    errortest       - Tests of erroneous syntax.
    testexec        - Test SLUL application and libraries.

In addition, the source directories contain `unittest` sub-directories with
a `test_*.c` corresponding to each `.c` source file.

Miscellaneous directories:

    misc            - Miscellaneous files
    |-- icons       - Very ugly icons. Should be replaced/improved :)
    \-- syntax      - Syntax highlighting definitions.
    notes           - Various random notes. Some may be irrelevant/obsolete,
                      or just some crazy ideas that will never be implemented.

The backend - CSBE
------------------

CSBE is a minimalistic self-contained compiler backend. It has the following
functionality:

* Functions for constructing an Intermediate Representation (IR).
* Machine code generation from the IR. Currently only aarch64.
* Output file generation from the machine code. Currently only ELF.

CSBE differs from other code generators (LLVM, QBE, ...) in these ways:

* There is no/minimal optimization. The goal is simplicity.
* The IR is *fully* architecture-neutral.
* There is no need for a toolchain/linker.
* There is no need for a sysroot. The IR has enough information to generate
  symbol imports, without any external information (i.e. libraries *don't*
  have to be installed at compile-time).
* **It is in a very, very early development phase :)**

Public header files (note that the API is **not** stable yet):

    include/csbe.h      - Public functions used by the frontend.
    include/csbe_ops.h  - Definitions of the IR operations.

Internal header files:

    csbe_internal.h                 - Non-static internal functions and types.
    codegen/codegen_common.h        - Helper functions for the codegen.
    outformat/outformat_common.h    - Helper functions for ELF/PE.

The compiler frontend
---------------------

The compiler works in the following steps:

1. `main.c` parses the command line options and initializes a compilation
   context object, `struct CSlul`.
2. The compilation stages are handled in `build.c`.
3. `mhtoken.c` / `mhparse.c` parse the "module header" lines in `main.slul`.
4. The main parsing (`token.c` / `parse.c`) is then done as follows:
    * If the module turns out to be an *application*:
        * The code in `main.slul` and any `\source` files are parsed into
          an AST (Abstract Syntax Tree).
    * If the module turns out to be a *library*:
        * The interface in `main.slul` is parsed to an AST.
        * The implementation in `\source` files is parsed.
          This is done in a separate AST.
    * Identifiers are "created" when they are first encountered during
      parsing. This includes references, not just definitions!
      Identifiers defined in a different AST (`struct TopLevels`) will be
      bound later, in the semantic verification phase.
5. For each dependency:
    * The currently installed version of each dependency is parsed
      (only the interface). Each one gets a separate AST.
    * Note that all interface dependencies must be specified in the
      module being compiled, so there is no need to handle recursive
      dependencies.
6. Semantic verification begins (see `cslul_ll_start_phase` in `context.c`):
    * `tlverify.c` binds identifiers to definitions in interfaces of
      libraries.
    * `tlverify.c` verifies declarations. Type definitions are verified
      by `typechk.c`.
    * `tlverify.c` calls `check_funcbody` in `funcchk.c` on each function
      body. This will also check that variables are assigned before use,
      etc.
    * Expressions are verified by `exprchk.c`.
    * Type compatibility is checked by `typecompat.c`.
7. `ir.c` generates IR from the AST(s).
8. `bwrapper.c` asks the backend (CSBE) to generate output file contents.

Public header file (note that the API is **totally UNstable**):

    cslul.h         - The interface used by main.c to perform compilation.

Internal header files:

    ast.h           - Structures in the AST.
    backend.h       - Functions in bwrapper.c, which in turn call CSBE.
    defaults.h      - Default paths on POSIX platforms (not used on Windows).
    errors.h        - Compiler error codes + messages.
    hash.h          - Pre-computed hashes of SLUL keywords.
    internal.h      - Non-static internal functions and types.
    tokencase.h     - "switch/case groups" of tokens.

The runtime library - `libslulrt.so`/`slulrt.dll`
-------------------------------------------------

The runtime library will contain the following functionality:

* Initialization of the SlulApp object and the root arena.
* Management of arenas.
* Wrappers around memcpy and memcmp.
* String functions.
* Possibly list functions as well.
* System functions (e.g. file I/O, network functions, etc.)

Public header file:

    include/slulrt.h    - Definitions for accessing slulrt from C

The Makefile
------------

The makefile supports common Makefile variables such as DESTDIR, prefix,
srcdir, etc. See `notes/build_defines.txt` for a summary.

There are some system-specific makefiles, e.g. `Makefile.bsd`, that set
some appropriate variables for the given system and then include the main
makefile.

A set of fast tests can be run with:

    make -s -j4 check

If you have TCC installed, you can run (most of the) tests with
bounds-checking enabled:

    make -s tcc-boundscheck

The tests can be run with Valgrind (use VALGRIND_OPTS=... to set options):

    make -s -j4 check-valgrind

A full check + scan, using several analysis tools, can be run. This can
take over 30 minutes on slow devices.

    make -s -j4 scan-all

If running `make` outside the source root directory, you need to use either
the `-C` option or set `srcdir`. If the source and build directories are
different, you need to run `make outdirs` before running any other make
commands.

Examples:

    # Using -C
    make -s -j4 -C .. check

    # Using srcdir
    make -s srcdir=/home/user/Code/slul outdirs
    make -s -j4 srcdir=/home/user/Code/slul -f /tmp/slul/Makefile check

Appendix: Descriptions of all .c files
--------------------------------------

Note that all `unittest/test_*.c` files are omitted. Those are all tests
of the corresponding `.c` file in the parent directory.

This listing can be generated with `make -s source-overview`.

In src-backend:

    analyze.c               -- IR analysis functions
    datastruct.c            -- Functions for creating CSBE data structures
    init.c                  -- Initialization for CSBE
    output.c                -- Output generation

In src-backend/outformat:

    elf.c                   -- ELF file handling
    outformat_common.c      -- Common functions for ELF/PE output
    raw.c                   -- Raw output format. Used to dump a textual IR

In src-backend/codegen:

    aarch64.c               -- Code generator for Aarch64
    codegen_common.c        -- Common functions for the code generators
    irdump.c                -- Dumps IR in text form
    x86.c                   -- Code generator for i386 and x86_64

In src-cslul:

    arch.c                  -- Handling of targets/multiarch
    arena.c                 -- Arena allocator
    build.c                 -- The main build function
    builtins.c              -- Sets up built-in definitions
    bwrapper.c              -- Wrapper around the backend (CSBE) and IR generator
    chkutil.c               -- Utility functions for the semantic checker
    config.c                -- Functions for configuring compilation contexts.
    context.c               -- Handles compilation context state
    errors.c                -- Builds table for error message strings
    exprchk.c               -- Expression checker
    funcchk.c               -- Checking of function bodies
    ir.c                    -- Generates Intermediate Representation (IR)
    main.c                  -- Entry point for CSLUL compiler
    mhparse.c               -- Parsing of module headers
    mhtoken.c               -- Tokenization of module headers
    misc.c                  -- Miscellaneous functions
    parse.c                 -- Parsing of a token stream to an AST
    platform.c              -- Platform dependent code
    print_hashes.c          -- Generates pre-computed hashcodes for hash.h
    tlverify.c              -- Verifies top-level symbols
    token.c                 -- Tokenization of source
    tree.c                  -- AVL tree map for hashed items
    typechk.c               -- Type checker
    typecompat.c            -- Type compatibility checker

In src-cslul/fuzz:

    aflmain.c               -- Special entry point for fuzzing C-SLUL

In src-cslul/testgen:

    testgen.c               -- Generates a very large source file with random functions

In src-cslul/winlibc:

    winlibc.c               -- Basic (incomplete) C library for Windows with UTF-8 support

In src-runtime:

    rtarena.c               -- arena management functions
    rtinit.c                -- Initialization functions for the runtime
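
Several entries in this overview (`arena.c`, `rtarena.c`, and the "root arena" of the runtime) refer to arenas. For readers unfamiliar with the term, the following is a generic sketch of bump-pointer arena allocation in C. It is not taken from those files, and the real SLUL arenas may be structured quite differently. The names `Arena`, `arena_init`, `arena_alloc` and `arena_free_all` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump-pointer arena. NOT the layout used by arena.c. */
typedef struct {
    unsigned char *base;  /* start of the backing buffer */
    size_t size;          /* total capacity in bytes */
    size_t used;          /* bytes handed out so far */
} Arena;

/* Allocates the backing buffer. Returns 1 on success, 0 on failure. */
int arena_init(Arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = a->base ? size : 0;
    a->used = 0;
    return a->base != NULL;
}

/*
 * Returns `n` bytes at an offset that is a multiple of `align`
 * (a power of two), or NULL if the arena has too little space left.
 * Offsets are relative to `base`, which malloc aligns for any basic type.
 */
void *arena_alloc(Arena *a, size_t n, size_t align)
{
    size_t start;

    assert(align != 0 && (align & (align - 1)) == 0);
    /* Round the current offset up to the next multiple of `align`. */
    start = (a->used + (align - 1)) & ~(align - 1);
    if (start < a->used || start > a->size || n > a->size - start) {
        return NULL;  /* overflow or out of space */
    }
    a->used = start + n;
    return a->base + start;
}

/* Releases every allocation at once. There is no per-object free. */
void arena_free_all(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

In this sketch an allocation is just an offset bump, and everything is released together by `arena_free_all`. That trade-off suits data whose objects share one lifetime, which is a common reason for compilers to allocate AST nodes from an arena.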