Expressions
===========

An expression is either a value, an operation (such as addition) or one of the "special expressions". The type of an expression is usually determined by the type of the target, so in the following declaration

    byte x = expr;

_expr_ would be taken to be of the **byte** type.

Values
======

Identifiers
-----------
An identifier in an expression references some kind of data (such as a variable or a constant). An identifier expression always has the same type as the definition that it refers to.

See identifiers.md for an explanation of how identifiers work in LRL.

Type identifiers
----------------
A type identifier is a special kind of identifier that has a colon in front of it. Such identifiers are searched for in the namespace of the _target type_. For instance, let's assume that the "bool" type is defined like this:

    typedef bool = enum(false, true);

Then the following declaration

    bool b = :false;

would search for the identifier "false" in the bool typedef. Type identifiers may also reference functions or data definitions in the namespace (and nested namespaces) of the target type's definition.

If the target type is not a typedef, then the search for a typedef will continue according to the following rules, which are applied recursively:

- **pointer type**: continue with the type that the pointer type points to.
- **optional type**: continue with the value type.
- **array type**: continue with the element type.
- **parametric type**: bind the type parameters and continue with the base type (which should be a typedef).
- **enum type**: search among the identifiers of the values.
- otherwise, stop and report an error.

Numbers
-------
LRL currently supports decimal and hexadecimal integers, as well as floating point numbers.

The type of an integer is determined by the number: larger numbers need larger integer types, and signed numbers need signed integer types.
For example:

    0      // type is eint8 (sign-less 8-bit integer)
    127    // type is eint8
    128    // type is uint8 (unsigned 8-bit integer)
    -128   // type is int8 (signed 8-bit integer)

    // Hexadecimal is written like this
    0x7f   // type is eint8
    -0x80  // type is int8
    0xff   // type is uint8
    0x100  // type is eint16

Optionally, numbers may be written with thousands separators like this:

    1_000_000

See types.md for more information about the numeric types.

There are also floating point number literals. For these you need to explicitly specify the type using an as-expression:

    3.14 as float
    3.14 as float32
    3.14 as float64
    3.14 as cfloat

Floating point numbers may also be written with thousands separators, and may in addition have a power-of-ten exponent:

    1_000.0
    4_321.987_654
    1e5           // 1 * 10^5
    1.234e6       // 1.234 * 10^6 = 1 234 000
    1_234.567e3   // 1 234.567 * 10^3 = 1 234 567

For NaN and Inf, see "Special values".

Strings
-------
A string is an array of bytes, and is written like this:

    "Hello"

They are also directly translated to arrays of bytes in LRL. The encoding of the source files should always be UTF-8, so non-ASCII characters will use several bytes in the array. Each of these bytes will have a value >= 128.

**TODO:** Currently, strings are translated to a byte pointer. Change to an array? But strings should be null-terminated if the length is not specified (e.g. if we take the address directly, as in @"Hello")

**TODO:** Enforce that strings (and identifiers) are actually UTF-8

Special characters such as quotes, newlines and backslashes can be escaped with a backslash. For example "Double quote \"\nBackslash \\". The following escape sequences are supported:

    \\       Backslash
    \"       Double quote
    \a       Alarm character (0x07)
    \b       Backspace character (0x08)
    \f       Form feed character (0x0C)
    \n       Linefeed character (0x0A)
    \r       Carriage return character (0x0D)
    \t       Tab character (0x09)
    \v       Vertical tab character (0x0B)
    \xNN     Raw hexadecimal byte 0xNN
    \uNNNN   Unicode character, 4 hexadecimal digits
    \UNN...  Unicode character, 8 hexadecimal digits
    /\*      Start of comment /* (which must be escaped)
    *\/      End of comment */ (which must be escaped)

Arrays
------
A literal array value is written in square brackets, for instance like this:

    [1, 2, 3, 4]

The type of the array elements is determined by the target type. For instance, in the following example the element values will be int32 values even though they could be eint8 values:

    int32#[4] data = [1, 2, 3, 4];

Array values can be nested:

    byte#[4,2] data = [[1,2], [2,3], [3,4], [4,5]];

It's allowed to put a comma after the last element, which is useful when an array spans multiple lines. For example, like this:

    byte#[4,2] data = [
        [1,2,],
        [2,3,],
        [3,4,],
        [4,5,],
    ];

Structs
-------
Struct values are written in parentheses with elements separated by commas, like this:

    (1, :true, 2, "test", 3.0, 4)

Just like array values, struct values may be nested and a final trailing comma is allowed. Also, in structs with exactly one element you **have** to put a trailing comma (so the parser can distinguish it from a grouping parenthesis). For example:

    (123,:true,)   // trailing comma
    (123,)         // trailing comma is required if there's exactly one element
    ((1,2,3),1)    // nested struct
    ()             // empty struct

The types of the elements are determined by the members of the target type, which must be a struct type. For example:

    (int, bool) a = (x, y);  // x must be an int, y must be a bool

Special values
--------------
In addition, LRL supports these special literal values:

- **undefined** - This value can be used to indicate that the value should never be read from. It can be used by code analyzers to check that this is in fact the case.
- **none** - This special value can be used with optional types, but also with "raw pointers" (see types.md). It's analogous to null, NULL or nil in other programming languages.
- **NaN** - This is a special floating point value, which stands for "Not A Number". It is used to indicate an error in a calculation.
- **Inf** - This is another special floating point value. It is used to represent infinity, and can be both positive and negative (i.e. +Inf and -Inf, though the + is optional).

Operators
=========

Arithmetic Operators
--------------------
Syntax:

    expr + expr
    expr - expr
    expr * expr
    expr / expr
    expr mod expr
    -expr
    +expr

**TODO** special cases?

**TODO** overflow/underflow is not allowed, except in wuint types.

**TODO** All operations except for "mod" require the target type to be known.

**TODO**

Binary Operators
----------------
Syntax:

    expr bitand expr
    expr bitor expr
    expr bitxor expr
    expr << expr
    expr >> expr
    compl expr

**TODO** bitwise operations are not allowed on signed types

**TODO** compl and right shift are, in addition, not allowed on eint types (because the backing type could have a sign bit)

**TODO** check that "uint16 u = compl 1;" works, since 1 is an eint8

**TODO**

Boolean Operators
-----------------
Syntax:

    not expr
    expr and expr
    expr or expr
    expr xor expr

**TODO**

Comparison Operators
--------------------
Syntax:

    expr == expr
    expr != expr
    expr < expr
    expr <= expr
    expr > expr
    expr >= expr

**TODO** what rules apply to the types? e.g. you should be able to compare int/byte but not (int)/(byte) or int^/byte^

**TODO**

Assignment Operators
--------------------
Syntax:

    expr = expr
    expr += expr
    expr -= expr
    expr *= expr
    expr /= expr
    expr <<= expr
    expr >>= expr

**TODO** Describe what the left hand side expression is, and what types of expressions are allowed.

**TODO** All subexpressions are evaluated exactly once.

**TODO** Assignment operations may not be nested inside expressions.

**TODO** Multiple assignment is allowed with the = operator. In that case the last expression is read, and its value is assigned to the other expressions in any order.

**TODO**

Address-of Operator
-------------------
Syntax:

    @expr

**TODO**

Dereference Operator
--------------------
Syntax:

    expr^

**TODO**

Size and Offset Operators
-------------------------
Syntax:

    sizeof expr
    minsizeof expr
    alignof expr
    offsetof expr

**TODO** what about determining the size of a type? Should there be something like "sizeof type int"?

**TODO**

enumbase Operator
-----------------
Syntax:

    enumbase expr

Extracts the base value of an enum value. For example:

    typedef Color = int enum (red=1, green=2, blue=3);
    Color g = :green;
    int i = enumbase g;

The operand must be of an enum type, but there's no specific target type (since the operand can be of any enum type). Note that the default base type of enums is "count" and not "int".

makeopt Operator
----------------
Syntax:

    makeopt expr

Wraps the expression in an optional value type. The target type of the operand is the value type of the operator's target type. The result is an optional type. Here's an example:

    bool? b = makeopt :false;

**TODO** The C backend doesn't support makeopt for non-pointer types.

Optional Operator
-----------------
Syntax:

    expr?

Extracts the value of an optional value. If the expression is none, then that's an error and the behavior is undefined.

The target type of the operand is an optional type of the target type of the operator.

then-else Operator
------------------
Syntax:

    condexpr then trueexpr else falseexpr

Evaluates either _trueexpr_ or _falseexpr_, depending on the value of _condexpr_. The result of the expression is that of the expression that was evaluated.

The expression _condexpr_ must be of bool type. The expressions _trueexpr_ and _falseexpr_ have the same target type as the then-else expression.

Special Expressions
===================

Array Index Operation
---------------------
Syntax:

    arrayexpr#[indexexpr]
    arrayexpr#[indexexpr,indexexpr...]

An array index operation references the element at the given index in the array.
Array indexes are zero-based, so the allowed range of array indexes in an array of type int#[5] is 0 to 4.

The index expression must be of type "count", or a strictly smaller type (one that is smaller on every platform, which e.g. "int" isn't necessarily). Otherwise, you can use a **typeassert** expression:

    int i = 1;
    byte#[5] arr = [5,4,3,2,1];
    byte x = arr#[i typeassert count]; // convert i into a count type

**TODO** it would be nice to have a bounds check at the same time (and/or index types, e.g. typedef arr = byte#[5 indextype arrlentype])

See the section on the typeassert operation for more information.

Elements in nested arrays may be accessed by using comma-separated indexes, like this:

    int#[3,2] nested = [[00,01], [10,11], [20,21]];
    int x = nested#[2,1];    // = 21
    int y = nested#[2]#[1];  // This is equivalent to the expression above

An out-of-bounds access is an error and has undefined behavior. However, it is allowed to have a pointer point to the element (which doesn't exist) after the last element. In the example below, this index is 3 in the outer array and 2 in the inner arrays. But referencing anything inside such a non-existent element is not allowed. For example:

    int#[3,2] nested = [[00,01], [10,11], [20,21]];
    int#[2]^ optr1 = @nested#[0];   // Pointer to normal array index
    int#[2]^ optr2 = @nested#[3];   // Pointer to index one step
                                    // after the last one
    int^ iptr1 = @nested#[0,2];     // Pointer to index one step after
                                    // the last one in the inner array.
    int^ iptr2 = @nested#[3,2];     // ERROR! Can't reference anything inside
                                    // a non-existent element. This has undefined
                                    // behavior.

Struct Member Operation
-----------------------
Syntax:

    structexpr.membername

The struct member operator is similar to the array index operator, except that it works on structs. It's used to access a value in a struct.
Here's an example:

    typedef Point = (int x, int y);

    () test() {
        var Point point;

        // Set a value in the struct
        point.x = 123;

        // Read a value from the struct
        int x = point.x;
    }

Function Member Operation
-------------------------
Syntax:

    expr->functionname

**TODO** maybe change to expr:>functionname or expr.:functionname

**TODO** also the current syntax makes it harder to add a -- operator (but x-->y can only be parsed as x-- > y)

**TODO** should it look up typedefs instead of structs?

**TODO**

Call Operation
--------------
Syntax:

    functionexpr()
    functionexpr(x)
    functionexpr(x,y)
    ...

**TODO**

as Operation
------------
Syntax:

    expr as type

This operation sets the target type. It is useful in situations where the target type can't be determined, such as in comparison expressions:

    int a = 1;
    int b = 2;
    if a + b as int == 5-2 as int {
        // do something
    }

It is only there to help the type checker determine the type. It can't be used to cast a value from one type into another. For that, use the typeassert expression.

typeassert Operation
--------------------
Syntax:

    expr typeassert type

The typeassert operator casts the value of the expression into the given type, if the type can hold that value. Unlike the typeassert statement, the typeassert expression offers no way of handling cases where the value is not compatible with the type (e.g. out of range). Using the typeassert expression on incompatible values is an error, and produces undefined behavior. Here's an example:

    int i = 3;
    byte b = i typeassert byte;  // int to byte

    int#[5] arr = [5,4,3,2,1];
    int v = arr#[i typeassert count];  // int to count

Note that the result type of the expression to cast must be known. If not, please use the "as" expression, like this:

    int i = 3;
    // the type of "i + 1" is not known
    byte b = i + 1 as int typeassert byte;
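
To make the phrase "if the type can hold that value" more concrete, here is a rough analogue in C. It is a sketch and not how LRL or its C backend implements typeassert. It assumes that "byte" is an unsigned 8-bit type, and the helper names (fits_in_byte, int_to_byte_unchecked, int_to_byte_checked) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of LRL): returns true when an int value
   can be represented as an unsigned 8-bit byte. Assumes "byte" is 8 bits
   and unsigned. */
static bool fits_in_byte(int value)
{
    return value >= 0 && value <= UINT8_MAX;
}

/* Rough analogue of "i typeassert byte". The caller promises that the
   value fits. The assert only catches a broken promise in debug builds;
   this is one possible treatment of the error case, not LRL's definition. */
static uint8_t int_to_byte_unchecked(int value)
{
    assert(fits_in_byte(value));
    return (uint8_t)value;
}

/* Rough analogue of a conversion that can be handled by the caller:
   writes the result to *out and returns true only when the value fits.
   On failure, *out is left untouched. */
static bool int_to_byte_checked(int value, uint8_t *out)
{
    if (!fits_in_byte(value)) {
        return false;
    }
    *out = (uint8_t)value;
    return true;
}
```

The unchecked variant mirrors the typeassert expression, where an incompatible value is simply an error. The checked variant is closer in spirit to a form that lets the caller handle the incompatible case, as the text says the typeassert statement does.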