aboutsummaryrefslogtreecommitdiffhomepage
path: root/notes/format_strings.txt
blob: dba360a5aa197aa179526f04da3a8de7caaa8ea4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128

Format strings
==============

Safer alternatives to varargs / format strings:

1. Have an "mixed-element-type-array" type, with runtime type info.
2. Have an "to-string-element-array" type, where the compiler
   automatically inserts conversions to string for each element.
   (Or perhaps an "auto-converting-string" type)
3. Require multiple calls:

    system.out.param_int(123)
    system.out.param_bool(true)
    system.out.param_string("abcdef")
    system.out.writeln("Hello string={3}, int={1,right,width=4,pad= }, bool={2,false=No,true=Yes}")

    - How to handle lifetimes?
    - A possible solution:

    {
        FormatParams p = .new(arena)
        p.param_int(123)
        p.param_bool(true)
        p.param_string("abcdef")
        system.out.writefmt("Hello string={3}, int={1,right,width=4,pad= }, bool={2,false=No,true=Yes}", p)
    }

    - Underlying API types and functions:

        type FormatParams
        func .new(arena) -> FormatParams
        func FormatParams.param_...(...)
            ... = { int,uint,int64,uint64,fileoffs,size,bool,string }
        
        type Formatter
        func .new(arena, size buffsize) -> Formatter
        func Formatter.start(string fmt, ref FormatParams params)
        func Formatter.next(ref writeonly byte[len] buff, size len) -> ?size
            where return <= len
        # Someting like this would be cool
        #func Formatter.next(out byte[len] buff, size len) -> ?size
        #    where only buff[0..return] is initialized

Language changes:

    - allow ? with types where a none value can be reserved?
        ?byte, ?uint8..16, ?int16, ?size, ?fileoffs
    - array initialization constraints? (probably hard)
    - 

Type-checked format strings?
----------------------------

C-like syntax:

    func greeting(arena Stream out, string name, int number) {
        out.write_fmt(%"Hello %s<name>, the number is %02d<number>")
        out.write_fmt(%"Hello %s(name), the number is %02d(number)")
        out.write_fmt(%"Hello %s[name], the number is %02d[number]")
        out.write_fmt(%"Hello %s{name}, the number is %02d{number}")
        out.write_fmt(%"Hello %{s:name}, the number is %{02d:number}")
        out.write_fmt("Hello \{s:name}, the number is \{02d:number}")
    }

Pros:
+ Familar to C programmers
+ Could be used in places where strings are expected:
    %"..." where string is expected: gets formatted first.
    %"..." where a format string is expected: passed as format string.

Cons:
- Might need to mix lowercase and uppercase: Future C standards can add
  more lowercase letters, so any SLUL extensions have to use (or start
  with) an uppercase letter.
- Incompatible syntax with C. To be compatible with C:

Extensible / general purpose formatting
---------------------------------------

    string s = .format("Hello \{s:name}, this is \{d,2,.right:number}")
    string s = .format("Hello \{s:name}, this is \{d(2,.right):number}")
    string s = .format("Hello \{.s(name)}, this is \{.d(number, 2, .right)}")
    string s = .format("Hello \.s(name), this is \.d(number, 2, .right)")

    func .format(Format f) -> string
    func .s(string s) -> FormatItem
    func .d(int n, int num_digits, Align align) -> FormatItem


    string s = .format("Hello \.s(name), this is \.rnum(number, 2)")
    func .rnum(int n, int num_digits) -> FormatItem {
        return .d(n, num_digits, .right)
    }

Including the variable name
---------------------------

    int id = 123
    string name = "Jane"
    string msg = .format("Error. \={id}, \={name}")

Would result in:

    Error. id=123, name="Jane"

Out-of-band placeholders?
-------------------------

Typically, the placeholders for values are in-band:

    "Hello %s"
    "Hello {}"
    "Hello {0}"

Out-of-band placeholders have two advantages:

* Possibly better performance since the string does not have to be scanned.
* Can avoid some security issues where format strings come from untrusted
  sources (this should not cause memory errors in SLUL, but it could still
  cause other problems)

And disadvantages:

* It can create surprises when two strings can have the exact same
  byte representation, but still behave in different ways when used
  as format strings.
* Translation tools/libraries would have to support the out-of-band
  placeholders.