Language Server?
================

Problems:
* Incremental code analysis
* LSP is more complex than necessary (JSON + JSON-RPC + UTF-16 positions + ...)
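
To illustrate the UTF-16 point: an editor that stores text as UTF-8 has to walk the line to turn an LSP-style UTF-16 column into a byte offset. A minimal sketch follows; the function name is made up for this note.

```python
def utf16_col_to_byte_offset(line: str, utf16_col: int) -> int:
    """Convert a UTF-16 code-unit column into a UTF-8 byte offset within `line`."""
    units = 0
    byte_offset = 0
    for ch in line:
        if units >= utf16_col:
            break
        # Characters outside the Basic Multilingual Plane take two UTF-16 units.
        units += 2 if ord(ch) > 0xFFFF else 1
        byte_offset += len(ch.encode("utf-8"))
    return byte_offset


# "a" is 1 UTF-16 unit / 1 byte, the emoji is 2 units / 4 bytes.
print(utf16_col_to_byte_offset("a\U0001F600b", 3))  # 5
```

With plain byte offsets this conversion disappears on both sides of the protocol.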

Some links:

https://matklad.github.io/2022/04/25/why-lsp.html
https://rust-analyzer.github.io/blog/2020/07/20/three-architectures-for-responsive-ide.html
https://old.reddit.com/r/vim/comments/b3yzq4/a_lsp_client_maintainers_view_of_the_lsp_protocol/
https://news.ycombinator.com/item?id=27877845

https://clangd.llvm.org

https://en.wikipedia.org/wiki/Language_Server_Protocol
https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/

https://github.com/Valloric/ycmd
https://www.scala-lang.org/blog/2018/06/15/bsp.html
https://github.com/Microsoft/VSDebugAdapterHost  (DAP, Debug Adapter Protocol)


What can be improved?
---------------------

* Use a simple binary protocol
* Use byte offsets
* Use a three-component architecture:
    - Editor
    - "Base language helper" (with common functionality that is shared across languages and editors)
    - Language server
* Use a fully asynchronous protocol
    - Numbered requests
    - Numbered responses
    - Cancellation can be supported (but should be an optional feature)
* Serialization:
    - Allow the language servers to serialize data
    - Can be used for caching
    - Can be used for snapshots
    - Can be used to implement cancellation (snapshot - terminate process - restart from snapshot)
    - Can be used to implement parallelism (snapshot - clone - merge)
* Delta-serialization:
    - Language server sends back modifications to the "Base language helper"
    - "Base language helper" can normalize/simplify the "internal tree".
* Always send all messages?
    - Avoids the need for handshake messages
    - Just ignore any unsupported messages
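
A rough sketch of how numbered requests/responses and "just ignore unsupported messages" could fit together in a binary framing. The header layout (u16 type, u32 request id, u32 payload length, little-endian), the message numbers and all function names are assumptions made for illustration; nothing here is a fixed part of the design.

```python
import struct

# Assumed frame header: msg_type (u16), request_id (u32), payload_len (u32),
# little-endian. None of these widths come from the notes.
HEADER = struct.Struct("<HII")

MSG_PING = 0x0001   # hypothetical message number the receiver supports
MSG_FANCY = 0x7F00  # hypothetical message number it does not support


def encode(msg_type: int, request_id: int, payload: bytes = b"") -> bytes:
    """Build one frame: header followed by the raw payload."""
    return HEADER.pack(msg_type, request_id, len(payload)) + payload


def decode_all(data: bytes):
    """Yield (msg_type, request_id, payload) for each complete frame in data."""
    pos = 0
    while pos + HEADER.size <= len(data):
        msg_type, request_id, length = HEADER.unpack_from(data, pos)
        end = pos + HEADER.size + length
        if end > len(data):
            break  # incomplete frame: a real reader would wait for more bytes
        yield msg_type, request_id, data[pos + HEADER.size:end]
        pos = end


HANDLERS = {MSG_PING: lambda payload: b"pong:" + payload}


def dispatch(data: bytes) -> bytes:
    """Answer every supported request; skip unsupported ones without error."""
    out = []
    for msg_type, request_id, payload in decode_all(data):
        handler = HANDLERS.get(msg_type)
        if handler is None:
            continue  # unsupported message: ignored, so no handshake is needed
        # The response reuses the request number so the sender can match it up.
        out.append(encode(msg_type, request_id, handler(payload)))
    return b"".join(out)


stream = encode(MSG_FANCY, 1, b"x") + encode(MSG_PING, 2, b"hi")
print(list(decode_all(dispatch(stream))))  # [(1, 2, b'pong:hi')]
```

Because every frame carries its length, a receiver can skip a message type it does not know without understanding the payload.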


Example
-------

Initialization:

 Editor->BLH:

    /*init*/0x0000 /*version*/0x0000 /*num_ext*/0x0001
    /*use_ext*/0x0001 /*extlen*/0x0004 /*extname*/"base"  (or use a string table?)
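
The init message above could be packed roughly like this. It assumes every numeric field is an unsigned 16-bit little-endian word and that each extension is introduced by its own use_ext word; neither is settled by the notes, and the function names are hypothetical.

```python
import struct

MSG_INIT = 0x0000     # from the example above
TAG_USE_EXT = 0x0001  # from the example above


def pack_init(version: int, extensions: list) -> bytes:
    """Pack init, version and num_ext, then one (use_ext, extlen, extname) per extension."""
    out = struct.pack("<HHH", MSG_INIT, version, len(extensions))
    for name in extensions:
        raw = name.encode("utf-8")
        out += struct.pack("<HH", TAG_USE_EXT, len(raw)) + raw
    return out


def unpack_init(data: bytes):
    """Inverse of pack_init: return (version, [extension names])."""
    _msg, version, num_ext = struct.unpack_from("<HHH", data, 0)
    pos = 6
    names = []
    for _ in range(num_ext):
        _tag, length = struct.unpack_from("<HH", data, pos)
        pos += 4
        names.append(data[pos:pos + length].decode("utf-8"))
        pos += length
    return version, names


msg = pack_init(0, ["base"])
print(msg.hex())         # 0000000001000100040062617365
print(unpack_init(msg))  # (0, ['base'])
```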

Key press:

 Editor->BLH:

 /*"insert"*/0x1234 /*line*/456 /*col*/12 /*end_line*/456 /*end_col*/12 /*ins_len*/1 /*ins*/"a"

 BLH->langserver:

 -ditto-
 ...new token?...

 langserver->BLH:

 ...delta serialization...
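
For the key-press message from Editor to BLH shown earlier in this example, a possible encoding is sketched below. The field widths (u16 message number, u32 for everything else, little-endian) and the function names are assumptions; the delta-serialization reply is left out because its format is still open.

```python
import struct

MSG_INSERT = 0x1234  # message number from the example above

# Assumed layout: msg (u16), line, col, end_line, end_col, ins_len (u32 each).
INSERT_HEADER = struct.Struct("<HIIIII")


def pack_insert(line: int, col: int, end_line: int, end_col: int, text: bytes) -> bytes:
    """Pack an insert message; the inserted bytes follow the fixed header."""
    header = INSERT_HEADER.pack(MSG_INSERT, line, col, end_line, end_col, len(text))
    return header + text


def unpack_insert(data: bytes):
    """Return (line, col, end_line, end_col, text) from a packed insert message."""
    msg, line, col, end_line, end_col, ins_len = INSERT_HEADER.unpack_from(data, 0)
    if msg != MSG_INSERT:
        raise ValueError("not an insert message")
    start = INSERT_HEADER.size
    return line, col, end_line, end_col, data[start:start + ins_len]


wire = pack_insert(456, 12, 456, 12, b"a")
print(len(wire))            # 23
print(unpack_insert(wire))  # (456, 12, 456, 12, b'a')
```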


Alternative design
------------------
Everything should be async

TODO: shouldn't dependencies also be tracked?


Initialization:
1. Ed->LS:  Send source (includes filename)
2. Ed<-LS:  Reply with a list of tokens (id, startbyte, endbyte, [token-info])
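
As an illustration of the token list in step 2, here is a toy tokenizer that produces (id, startbyte, endbyte) triples. The regular expression is a stand-in for a real lexer, endbyte is assumed to be exclusive, and token-info is omitted.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    id: int
    startbyte: int
    endbyte: int  # exclusive end offset (an assumption)


# Toy lexer over raw bytes: runs of word characters, or single punctuation bytes.
TOKEN_RE = re.compile(rb"\w+|[^\w\s]")


def tokenize(source: bytes) -> list:
    """Return tokens numbered in source order, with byte offsets into `source`."""
    return [Token(i, m.start(), m.end())
            for i, m in enumerate(TOKEN_RE.finditer(source))]


for tok in tokenize(b"int x = 42;"):
    print(tok)
# Token(id=0, startbyte=0, endbyte=3)
# Token(id=1, startbyte=4, endbyte=5)
# Token(id=2, startbyte=6, endbyte=7)
# Token(id=3, startbyte=8, endbyte=10)
# Token(id=4, startbyte=10, endbyte=11)
```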


Include file request (asynchronous; can also be used to re-request the main file):
1. Ed<-LS:  Request file
2. Ed->LS:  Send source (includes filename)
3. Ed<-LS:  (optional) Reply with a list of tokens (id, startbyte, endbyte)

 --- Parsed Info ---

Error message (asynchronous):
1. Ed<-LS:  Error message (error-id, startbyte, endbyte)

Completion suggestion (asynchronous):
1. Ed<-LS:  Suggestion (suggestion-request-id, suggestion, suggestion-type, kind, have-more-results)
            suggestion-types:
                "one-possibility"
                "very-likely"
                "normal"
                "unlikely"
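
One way the editor side might collect these asynchronous suggestion messages: buffer them per suggestion-request-id until have-more-results is false, then order them by suggestion-type. The class and field names below are invented for this sketch.

```python
from dataclasses import dataclass
from enum import IntEnum


class SuggestionType(IntEnum):
    # Lower value = shown first; the ordering is an assumption.
    ONE_POSSIBILITY = 0
    VERY_LIKELY = 1
    NORMAL = 2
    UNLIKELY = 3


@dataclass
class Suggestion:
    request_id: int
    text: str
    type: SuggestionType
    kind: str
    have_more_results: bool


class SuggestionCollector:
    """Buffers suggestions per request until the last one has arrived."""

    def __init__(self):
        self.pending = {}

    def add(self, s: Suggestion):
        """Return the sorted list once complete, otherwise None."""
        self.pending.setdefault(s.request_id, []).append(s)
        if s.have_more_results:
            return None
        done = self.pending.pop(s.request_id)
        return sorted(done, key=lambda x: x.type)


collector = SuggestionCollector()
collector.add(Suggestion(7, "printf", SuggestionType.NORMAL, "function", True))
result = collector.add(Suggestion(7, "print", SuggestionType.VERY_LIKELY, "function", False))
print([s.text for s in result])  # ['print', 'printf']
```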

Fix suggestion:
1. Ed<-LS:   (error-id, description, [edit-operation...])

Code edit result from action (same as edit calls, but with an "action-request-id"):
1. Ed<-LS:  (action-request-id, token-id, offset, delete-len, insert-string)
1. Ed<-LS:  (action-request-id, token-before-or-null, offset, delete-len, insert-string)
1. Ed<-LS:  (action-request-id, file-id, byte-offset, delete-len, replace-with-string)

Token information (asynchronous):
1. Ed<-LS:  Token information (id, startbyte, endbyte, [token-info], add-to-existing, have-more-results)
            token-info: (kind, [(definition-type, definition-ref)...], [available-action...], [maybe-available-action...])
            (the same token can be redefined when there is more information)
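
A small sketch of how an editor might handle add-to-existing when the same token is redefined with more information. It assumes token-info arrives as a list of opaque entries; the `TokenStore` name and its method are hypothetical.

```python
class TokenStore:
    """Keeps the latest known info per token id."""

    def __init__(self):
        self.info = {}

    def update(self, token_id: int, token_info: list, add_to_existing: bool):
        if add_to_existing:
            # Extend what is already known about this token.
            self.info.setdefault(token_id, []).extend(token_info)
        else:
            # Redefinition: discard earlier info for this token.
            self.info[token_id] = list(token_info)


store = TokenStore()
store.update(3, ["kind=identifier"], add_to_existing=False)
store.update(3, ["definition=decl#12"], add_to_existing=True)
print(store.info[3])  # ['kind=identifier', 'definition=decl#12']
store.update(3, ["kind=keyword"], add_to_existing=False)
print(store.info[3])  # ['kind=keyword']
```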

Token range information (asynchronous):
#1. Ed<-LS:  Token range (start-id, end-id, type-ref(definition/builtin), name-ref(definition/string))
1. Ed<-LS:  Token range (start-id, end-id, [token-info], add-to-existing, have-more-results)
            FIXME cannot support holes in the range
            FIXME more kinds of references?
            FIXME cannot support multiple references of the same type

Declaration:
1. Ed<-LS:  (definition-id, textual-name, kind, start-token-id, end-token-id, main-token-id)

"Body of declaration" / definition / implementation:
1. Ed<-LS:  (definition-id, start-token-id, end-token-id, main-token-id)

 --- Edits ---

Edit inside token:
1. Ed->LS:   (token-id, offset, delete-len, insert-string)

Edit after token:
1. Ed->LS:   (token-before-or-null, offset, delete-len, insert-string)

Large delete/replace:
1. Ed->LS:  (file-id, byte-offset, delete-len, replace-with-string)
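
All three edit messages can be reduced to a single byte splice on the receiving side. The sketch below assumes that offset is relative to the token start for "edit inside token", relative to the end of the preceding token (or the start of the file when it is null) for "edit after token", and that tokens are stored as id -> (startbyte, endbyte). These interpretations and the function names are guesses, not settled design.

```python
def splice(buf: bytes, offset: int, delete_len: int, insert: bytes) -> bytes:
    """Delete delete_len bytes at offset and put `insert` in their place."""
    return buf[:offset] + insert + buf[offset + delete_len:]


def edit_inside_token(buf, tokens, token_id, offset, delete_len, insert):
    start, _end = tokens[token_id]
    return splice(buf, start + offset, delete_len, insert)


def edit_after_token(buf, tokens, token_before, offset, delete_len, insert):
    # A null (None) token means "before the first token", i.e. start of file.
    base = 0 if token_before is None else tokens[token_before][1]
    return splice(buf, base + offset, delete_len, insert)


buf = b"int x = 42;"
tokens = {0: (0, 3), 1: (4, 5), 2: (6, 7), 3: (8, 10), 4: (10, 11)}

print(edit_inside_token(buf, tokens, 3, 1, 1, b"37"))  # b'int x = 437;'
print(edit_after_token(buf, tokens, 1, 0, 0, b"y"))    # b'int xy = 42;'
print(splice(buf, 0, 3, b"long"))                      # b'long x = 42;'
```

The token table is not updated here; after an edit the language server would be expected to send fresh token information.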

Request suggestions / Move cursor:
1. Ed->LS:  (suggestion-request-id, line, column)

Set area of interest:
1. Ed->LS:  (interest-type, token-start, token-end)
            interest-type:
                "visible-area"  = look quickly for info
                "hover"         = look hard
                "menu"          = look harder
                "definitions"   = look for definitions
                "usages"        = look for usages
                (note: the editor is responsible for filtering unwanted info!)

Perform action:
1. Ed->LS:  (action-id, token-id)
1. Ed->LS:  (action-id, token-start-id, token-end-id)