Many formats (e.g. YAML, TOML) are quite complex and have syntax-based typing.
INI-like formats are simple, but different parsers have different limitations:
* spaces in [sections] might not be allowed
* properties and [sections] may be case insensitive
- some implementations are, some aren't.
* escaping special characters in keys is usually not possible
* duplicate properties might be disallowed
- GLib merges them!
* duplicate sections might be disallowed (good, would just be confusing anyway)
* leading and trailing spaces seem to be removed, but the handling of spaces around = seems to be implementation-dependent (I think most parsers remove them)
- https://cloanto.com/specs/ini/ allows spacing around the =
* some specs disallow non-ASCII property names. Some probably use latin1?
* in keys, many special characters may be problematic: . " \ _ etc.
- but . appears in filenames!
* are properties without = allowed???
- in some formats yes
- in libconfini, these are called "implicit keys"
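As one concrete data point, Python's standard configparser module shows several of the behaviours listed above. This only illustrates a single parser; other implementations differ, as noted. The helper names parse and rejects are made up for this sketch.

```python
import configparser

def parse(text, **kwargs):
    """Parse INI text with Python's standard configparser."""
    parser = configparser.ConfigParser(**kwargs)
    parser.read_string(text)
    return parser

def rejects(text, exc, **kwargs):
    """Return True if parsing text raises the given exception type."""
    try:
        parse(text, **kwargs)
    except exc:
        return True
    return False

# Property names are case-insensitive here (lowercased); section names are not.
p = parse("[Main]\nName = demo\n")
assert p["Main"]["name"] == "demo"
assert not p.has_section("main")

# Spaces around "=" and trailing spaces are stripped.
p = parse("[s]\nkey   =   value  \n")
assert p["s"]["key"] == "value"

# Duplicate properties and duplicate sections are rejected by default.
assert rejects("[s]\na=1\na=2\n", configparser.DuplicateOptionError)
assert rejects("[s]\na=1\n[s]\nb=2\n", configparser.DuplicateSectionError)

# With strict=False the last duplicate silently wins.
p = parse("[s]\na=1\na=2\n", strict=False)
assert p["s"]["a"] == "2"

# Properties without "=" are an error unless allow_no_value is set.
assert rejects("[s]\nflag\n", configparser.ParsingError)
p = parse("[s]\nflag\n", allow_no_value=True)
assert p["s"]["flag"] is None
```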
SLUL format:
* Avoid spaces in [sections]
* lowercase or Capitalized?
* multiple word sections? multi_word or MultiWord?
* Comment character, both # ; or only one?
* Disallow comments in values? Would require escaping.
- but on the other hand, many libraries parse them,
so allowing #; in values will break those libraries
(and only seldom, which is even worse)
- https://cloanto.com/specs/ini/ seems to only allow ; comments after a space or at the start of a line
(which is good to enforce if comments are allowed)
- the same applies to quotes... perhaps " should simply be disallowed in values?
- also, warn about repeated spaces in values? some implementations remove repeated spaces
* only allow lowercase filenames in files and properties
- also solves issues when transferring files between case
sensitive/insensitive file systems.
- case sensitive -> case insensitive => duplicate files issue
- case insensitive -> case sensitive => wrong case -> file not found
* only allow a-z0-9_ in file names, excluding file extension
- can't reliably lowercase otherwise (it is tricky with Unicode, and locale-dependent)
- spaces and special characters are not allowed by all ini parsers
(e.g. git-config does not seem to allow underscores)
- filesystems have restrictions on the allowed characters. don't want portability issues.
- strange Unicode characters could be a security issue (e.g. lookalikes, or RTL)
- special characters could be a security issue (e.g. quotes, control characters, etc.)
- some INI parser implementations parse the part before a . as a section
(may be a non-issue, since those usually merge section+key into one string),
so perhaps the file extension should be omitted?
but this is bad for usability/discoverability (i.e. what the sections should contain)
* Encoding?
- UTF-8 but only allowed in comments
- Exception for "author", "description" etc.
- If UTF-8 is allowed, then we need to restrict control characters, RTL etc.
(control characters need to be limited regardless. don't want to allow NUL characters for example)
- If we move out translations and author names, then we could require all data to be ASCII
* Print warning if BOM is present
* Line continuations?
- Needed if lines can get long
- But the best would be if long lines can be avoided
- Regardless, it makes sense to have length limits.
(for example, a 1 MB module name would only cause problems)
- 50 ASCII characters for names/identifiers/filenames
- Values that allow UTF-8 should have a limit in bytes
- 50 bytes for ALL values??
- If disallowed, we should report errors when possible (e.g. indented key, invalid symbol in value)
* Internationalization?
- Multi-language support for properties?
- GLib uses Property[xx]=Value for this
* Avoid duplicate keys.
[dependencies]
gtk=gtk2 or gtk3
# versions can be specified
gtk3=3.2
somelib=any
# if a version starts with a letter, write it like this:
somelib=somelib A1.1
# syntax examples:
# lib1=1.0 or lib1x 1.0
# lib2=lib2 1.0 or lib2x 1.0
# and maybe:
# lib3=lib3 1.0 !1.1-1.1.5
# lib3=lib3 1.0 (not 1.1-1.1.5)
# the same library name may not be specified (explicitly or implicitly) twice
#
* Disallow duplicate sections
* The most important properties should have = in them, so they can be parsed
by tools that might use some library that requires them.
- file names don't need this. those should only be used by the compiler
- names, versions, urls might need it.
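A minimal sketch of how the rules above could be checked. It assumes the open questions are settled as follows (assumptions for illustration, not decisions from these notes): lowercase a-z0-9_ names of at most 50 characters for sections and properties, ASCII-only values of at most 50 bytes, # and ; comments only on their own line, and no line continuations. The names check, NAME_RE and MAX_VALUE_BYTES are hypothetical.

```python
import re

# Assumed rules (the notes leave several of these open):
NAME_RE = re.compile(r"[a-z0-9_]{1,50}")  # section and property names
MAX_VALUE_BYTES = 50                       # byte limit for every value

def check(text):
    """Parse text into {section: {key: value}}.

    Returns (data, warnings); raises ValueError on the first violation.
    """
    warnings = []
    if text.startswith("\ufeff"):
        warnings.append("BOM present")
        text = text[1:]
    data, section = {}, None
    for lineno, raw in enumerate(text.split("\n"), start=1):
        where = f"line {lineno}: "
        line = raw.rstrip("\r")
        # Control characters (NUL, tab, ...) are rejected everywhere.
        if any(ord(c) < 0x20 or ord(c) == 0x7F for c in line):
            raise ValueError(where + "control character")
        # No line continuations, so an indented non-blank line is an error.
        if line.startswith(" ") and line.strip(" "):
            raise ValueError(where + "indented line")
        line = line.strip(" ")
        if not line or line[0] in "#;":
            continue  # blank line or comment (comments may contain UTF-8)
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            if not NAME_RE.fullmatch(section):
                raise ValueError(where + "invalid section name")
            if section in data:
                raise ValueError(where + "duplicate section")
            data[section] = {}
            continue
        key, sep, value = line.partition("=")
        key, value = key.rstrip(" "), value.lstrip(" ")
        if not sep:
            raise ValueError(where + "missing =")
        if section is None:
            raise ValueError(where + "property outside a section")
        if not NAME_RE.fullmatch(key):
            raise ValueError(where + "invalid property name")
        if key in data[section]:
            raise ValueError(where + "duplicate property")
        if not value.isascii() or len(value.encode()) > MAX_VALUE_BYTES:
            raise ValueError(where + "value not ASCII or too long")
        data[section][key] = value
    return data, warnings
```

For example, check("[dependencies]\ngtk3=3.2\n") returns ({'dependencies': {'gtk3': '3.2'}}, []), while a repeated gtk3 line in the same section raises ValueError. Failing on the first violation keeps the sketch short; a real tool would probably collect all errors.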