Code Overview
=============
Directory Overview
------------------
The source directories:
src-backend - The backend part. Also called CSBE (CSlul BackEnd)
|-- codegen - The code generators (e.g. aarch64)
\-- outformat - The output formats (e.g. ELF)
src-common - Common code for unit testing etc.
src-cslul - The compiler frontend (parser/analyzer/ir-gen) and CLI.
\-- winlibc - Minimal libc for WINE/Windows.
src-runtime - The runtime library for SLUL programs.
The directories with test SLUL source code:
errortest - Tests of erroneous syntax.
testexec - Test SLUL application and libraries.
In addition, the source directories contain `unittest` sub-directories
with a `test_*.c` corresponding to each `.c` source file.
Miscellaneous directories:
misc - Miscellaneous files
|-- icons - Very ugly icons. Should be replaced/improved :)
\-- syntax - Syntax highlighting definitions.
notes - Various random notes. Some may be irrelevant/obsolete,
or just some crazy ideas that will never be implemented.
The backend - CSBE
------------------
CSBE is a minimalistic self-contained compiler backend. It has the
following functionality:
* Functions for constructing an Intermediate Representation (IR)
* Machine code generation from the IR. Currently only aarch64.
* Output file generation from the machine code. Currently only ELF.
CSBE differs from other code generators (LLVM, QBE, ...) in these ways:
* There is no/minimal optimization. The goal is simplicity.
* The IR is *fully* architecture-neutral.
* There is no need for a toolchain/linker.
* There is no need for a sysroot. The IR has enough information
to generate symbol imports, without any external information
(i.e. libraries *don't* have to be installed at compile-time).
* **It is in a very, very early development phase :)**
Public header files (note that the API is **not** stable yet):
include/csbe.h - Public functions used by the frontend.
include/csbe_ops.h - Definitions of the IR operations.
Internal header files:
csbe_internal.h - Non-static internal functions and types.
codegen/codegen_common.h - Helper functions for the codegen.
outformat/outformat_common.h - Helper functions for ELF/PE.
The compiler frontend
---------------------
The compiler works in the following steps:
1. `main.c` parses the command line options, and initializes a
compilation context object, `struct CSlul`.
2. The compilation stages are handled in `build.c`.
3. `mhtoken.c` / `mhparse.c` parse the "module header" lines in `main.slul`.
4. The main parsing (`token.c` / `parse.c`) is then done as follows:
* If the module turns out to be an *application*:
* The code in `main.slul` and any `\source` files are parsed
into an AST (Abstract Syntax Tree).
* If the module turns out to be a *library*:
* The interface in `main.slul` is parsed to an AST.
* The implementation in `\source` files is parsed. This is
done in a separate AST.
* Identifiers are "created" when they are first encountered during
parsing. This includes references, not just definitions!
Identifiers defined in a different AST (struct TopLevels) will
be bound later, in the semantic verification phase.
5. For each dependency:
* The currently installed version of each dependency is parsed
(only the interface). Each one gets a separate AST.
* Note that all interface dependencies must be specified in the module
being compiled. So there isn't any need to handle recursive
dependencies.
6. Semantic verification begins (see `cslul_ll_start_phase` in `context.c`)
* `tlverify.c` binds identifiers to definitions in interfaces of
libraries.
* `tlverify.c` verifies declarations. Type definitions are verified
by `typechk.c`.
* `tlverify.c` calls `check_funcbody` in `funcchk.c` on each
function body. This will also check that variables are
assigned before use, etc.
* Expressions are verified by `exprchk.c`.
* Type compatibility is checked by `typecompat.c`.
7. `ir.c` generates IR from the AST(s).
8. `bwrapper.c` asks the backend (CSBE) to generate output file
contents.
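The "create on first encounter, bind later" idea from step 4 and step 6 can be illustrated with a small sketch. This is not code from the compiler: `struct Ident`, `struct TopLevelsSketch`, `ident_get` and `bind_idents` are hypothetical names invented for this example, and the real structures in `ast.h` and the real lookup (which uses hashed tree maps, per `tree.c`) will differ. It only shows that a reference can create an identifier entry during parsing, and that entries without a local definition are resolved against another AST in a later pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; NOT the real structures in ast.h. */
#define MAX_IDENTS 16

struct Ident {
    const char *name;
    int defined;               /* 1 if a definition was seen in this AST */
    const struct Ident *bound; /* definition in another AST, set when binding */
};

struct TopLevelsSketch {
    struct Ident idents[MAX_IDENTS];
    size_t count;
};

/* Parsing phase: returns the entry for `name`, creating it if it does not
   exist yet. Both references and definitions go through this function.
   Returns NULL if the fixed-size table is full. */
struct Ident *ident_get(struct TopLevelsSketch *tl, const char *name)
{
    size_t i;
    assert(tl != NULL && name != NULL);
    for (i = 0; i < tl->count; i++) {
        if (strcmp(tl->idents[i].name, name) == 0) {
            return &tl->idents[i];
        }
    }
    if (tl->count == MAX_IDENTS) {
        return NULL;
    }
    tl->idents[tl->count].name = name;
    tl->idents[tl->count].defined = 0;
    tl->idents[tl->count].bound = NULL;
    return &tl->idents[tl->count++];
}

/* Verification phase: binds every identifier that has no local definition
   to a definition in `iface`. Returns the number left unresolved. */
size_t bind_idents(struct TopLevelsSketch *impl,
                   const struct TopLevelsSketch *iface)
{
    size_t i, j, unresolved = 0;
    for (i = 0; i < impl->count; i++) {
        struct Ident *id = &impl->idents[i];
        if (id->defined) {
            continue;
        }
        for (j = 0; j < iface->count; j++) {
            const struct Ident *cand = &iface->idents[j];
            if (cand->defined && strcmp(cand->name, id->name) == 0) {
                id->bound = cand;
                break;
            }
        }
        if (id->bound == NULL) {
            unresolved++;
        }
    }
    return unresolved;
}
```

In this sketch, a parser would call `ident_get` for every identifier token and set `defined` when it sees a definition. A non-zero return value from `bind_idents` would correspond to "undefined identifier" errors in the verification phase.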
Public header file (note that the API is **totally UNstable**):
cslul.h - The interface used by main.c to perform compilation.
Internal header files:
ast.h - Structures in the AST
backend.h - Functions in bwrapper.c, which in turn call CSBE.
defaults.h - Default paths on POSIX platforms (not used on Windows).
errors.h - Compiler error codes + messages.
hash.h - Pre-computed hashes of SLUL keywords.
internal.h - Non-static internal functions and types.
tokencase.h - "switch/case groups" of tokens.
The runtime library - `libslulrt.so`/`slulrt.dll`
---------------------------------------------
The runtime library will contain the following functionality:
* Initialization of the SlulApp object and the root arena.
* Management of arenas.
* Wrappers around memcpy and memcmp.
* String functions.
* Possibly list functions as well.
* System functions (e.g. file I/O, network functions, etc.)
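For readers unfamiliar with arenas, the following is a minimal sketch of a bump-pointer arena in C. It is an illustration only and is not taken from `rtarena.c` or `arena.c`: the names `struct ArenaSketch`, `arena_init`, `arena_alloc` and `arena_free` are hypothetical, and the actual runtime may use a different layout (for example chained blocks) and a different API. The point is that allocations are carved out of one block and released all at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch; NOT the arena implementation used by slulrt. */
#define ARENA_ALIGN 16  /* offsets are rounded up to a multiple of this */

struct ArenaSketch {
    unsigned char *base; /* start of the single backing block */
    size_t size;         /* total size of the block in bytes */
    size_t used;         /* bytes handed out so far (incl. padding) */
};

/* Allocates the backing block. Returns 1 on success, 0 on failure. */
int arena_init(struct ArenaSketch *a, size_t size)
{
    assert(a != NULL);
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out `n` bytes from the block, or returns NULL if they don't fit.
   Individual allocations are never freed on their own. */
void *arena_alloc(struct ArenaSketch *a, size_t n)
{
    size_t start = (a->used + (ARENA_ALIGN - 1)) & ~(size_t)(ARENA_ALIGN - 1);
    if (start > a->size || n > a->size - start) {
        return NULL;
    }
    a->used = start + n;
    return a->base + start;
}

/* Releases everything that was allocated from the arena in one step. */
void arena_free(struct ArenaSketch *a)
{
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

A design like this makes allocation a pointer bump and removes the need to track individual lifetimes, which is presumably why both the compiler and the runtime are organized around arenas.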
Public header file:
include/slulrt.h - Definitions for accessing slulrt from C
The Makefile
------------
The makefile supports common Makefile variables such as DESTDIR, prefix,
srcdir, etc. See `notes/build_defines.txt` for a summary. There are some
system-specific makefiles, e.g. `Makefile.bsd`, that set some appropriate
variables for the given system and then include the main makefile.
A set of fast tests can be run with:
make -s -j4 check
If you have TCC installed, you can run (most of the) tests with
bounds-checking enabled:
make -s tcc-boundscheck
The tests can be run with Valgrind (use VALGRIND_OPTS=... to set options):
make -s -j4 check-valgrind
A full check + scan, using several analysis tools, can also be run (this can
take over 30 minutes on slow devices):
make -s -j4 scan-all
If running `make` outside the source root directory, you need to use
either the `-C` option or set `srcdir`. If the source and build directories
are different, you need to run `make outdirs` before running any other
make commands. Examples:
# Using -C
make -s -j4 -C .. check
# Using srcdir
make -s srcdir=/home/user/Code/slul outdirs
make -s -j4 srcdir=/home/user/Code/slul -f /tmp/slul/Makefile check
Appendix: Descriptions of all .c files
--------------------------------------
Note that all `unittest/test_*.c` files are omitted. Those are all tests of
the corresponding `.c` file in the parent directory.
This listing can be generated with `make -s source-overview`.
In src-backend:
analyze.c -- IR analysis functions
datastruct.c -- Functions for creating CSBE data structures
init.c -- Initialization for CSBE
output.c -- Output generation
In src-backend/outformat:
elf.c -- ELF file handling
outformat_common.c -- Common functions for ELF/PE output
raw.c -- Raw output format. Used to dump a textual IR
In src-backend/codegen:
aarch64.c -- Code generator for Aarch64
codegen_common.c -- Common functions for the code generators
irdump.c -- Dumps IR in text form
x86.c -- Code generator for i386 and x86_64
In src-cslul:
arch.c -- Handling of targets/multiarch
arena.c -- Arena allocator
build.c -- The main build function
builtins.c -- Sets up built-in definitions
bwrapper.c -- Wrapper around the backend (CSBE) and IR generator
chkutil.c -- Utility functions for the semantic checker
config.c -- Functions for configuring compilation contexts.
context.c -- Handles compilation context state
errors.c -- Builds table for error message strings
exprchk.c -- Expression checker
funcchk.c -- Checking of function bodies
ir.c -- Generates Intermediate Representation (IR)
main.c -- Entry point for CSLUL compiler
mhparse.c -- Parsing of module headers
mhtoken.c -- Tokenization of module headers
misc.c -- Miscellaneous functions
parse.c -- Parsing of a token stream to an AST
platform.c -- Platform dependent code
print_hashes.c -- Generates pre-computed hashcodes for hash.h
tlverify.c -- Verifies top-level symbols
token.c -- Tokenization of source
tree.c -- AVL tree map for hashed items
typechk.c -- Type checker
typecompat.c -- Type compatibility checker
In src-cslul/fuzz:
aflmain.c -- Special entry point for fuzzing C-SLUL
In src-cslul/testgen:
testgen.c -- Generates a very large source file with random functions
In src-cslul/winlibc:
winlibc.c -- Basic (incomplete) C library for Windows with UTF-8 support
In src-runtime:
rtarena.c -- arena management functions
rtinit.c -- Initialization functions for the runtime