~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ ~~ Bootstrapping the log ~~
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
~
~ The log is the main region of memory within which most dynamic allocation
~ happens. It's a single contiguous segment of virtual memory, which is
~ requested from the kernel when Evocation starts up. Almost all of
~ Evocation's dynamic data is kept in the log, including the main dictionary;
~ several important global variables which make it possible to find and
~ allocate other data structures; and the control stack.
~
~ This file provides words for working with the log, and more specifically
~ for helping to bring the log into existence. Once the log exists, it can be
~ used to manage itself, but there's a bootstrapping challenge in getting
~ there. That challenge is solved by the warm-start routine in execution.e,
~ which relies on the words in this file and should load after it.
~
~ Some modern Forths, including Jonesforth, refer to the log as the heap.
~ This is a misnomer; a heap is a data structure that allows non-contiguous
~ allocation. Although there are Forths that have true heaps, Evocation is not
~ one of them. Space in the log is allocated by incrementing the "here"
~ variable (one of those important globals), which necessarily can only
~ allocate contiguous blocks; there is no way to compact allocations to
~ reclaim fragmented, unused space in between them. Evocation does allow
~ deallocation using "forget", but this is done by resetting "here" and
~ "latest" to older values, unwinding every allocation that's been done since
~ the point in time they return to.
~
~ It would be a mistake to confuse this allocation strategy with the
~ more-general facilities for allocation, reallocation, and deallocation of
~ individual memory blocks that many other languages have. To avoid confusion,
~ we stay away from the name "heap", though it may still occasionally be used
~ colloquially because it's familiar from other Forths, and because most
~ programming languages have a heap as the main memory segment they request
~ from the kernel.
~
~ In the strictest technical sense, the log is a stack: Things are added
~ to the end of it, and removed from that same end. However, Evocation already
~ has two other stacks, the control and value stacks. Adding to the potential
~ confusion, the control stack is actually stored inside the log (as a
~ fixed-size chunk at the bottom). Besides, the log isn't really that much
~ like a stack when you look at how it's actually used. Unlike Evocation's
~ control and value stacks, data structures on the log tend to be rich and
~ complex, interlinked in various ways through the use of pointers. They are
~ also usually long-lived, and the log generally grows over time, whereas the
~ control and value stacks stay roughly the same size through cycles of
~ growth and shrinking. In order to be able to speak precisely about what
~ we're doing, we introduce the name "log" to refer to the entire memory
~ segment and everything stored within it.
~
~ Another linguistic choice we make is to be clear about dictionaries. A
~ dictionary is a linked list of word entries. Each dictionary has a specific
~ handle, a pointer to a pointer, which is the root of the list. Each
~ word entry begins with a specific data structure, which among other things
~ includes a next-entry pointer, a flags byte, and a string that serves as
~ the entry's name. Older entries in a dictionary seldom change; newer entries
~ are added at the beginning of it, with their next-entry pointers leading to
~ the older entries. It is possible for several dictionaries to exist at once,
~ each with its own dictionary handle.
~
~ Since dictionaries are managed using pointers to individual entries, there
~ is no specific requirement about the order in which those entries occur in
~ memory or where they are allocated, but usually a new entry is allocated at
~ the end of the log, by incrementing the variable "here", in the same manner
~ as any other allocation. There is one particular dictionary, the main
~ dictionary, whose handle is the variable "latest". The main dictionary holds
~ every executable word that can be used normally via Evocation's interpreter.
~
~ Since the main dictionary is by far the most important thing in the log,
~ it can be tempting to conflate the log with the main dictionary. This is
~ accurate enough for some purposes, but note that other dictionaries are
~ often interleaved with it, their allocations entwining like grape vines even
~ while each remains separate, reachable only via its own root. See the
~ machine label facility, in labels.e, for an example of how a secondary
~ dictionary can be useful.
~
~ This may feel tangential, but it's important background and there's no
~ better place to explain it: A handle is a pointer to a pointer. The variable
~ "latest" returns a handle, a fixed address which always holds the pointer to
~ the root entry of the main dictionary. Dereferencing that handle gives you
~ the dictionary pointer, the address of the root entry, which is suitable to
~ pass to find-in and similar words that read the dictionary's contents. When
~ you want to add a new entry to a dictionary, you need the dictionary's
~ handle, so that the root pointer can be changed. When you only want to read
~ it, you only need the regular single pointer.
~
~ When reading the documentation of words that work with dictionaries, pay
~ close attention to whether their parameters include a dictionary handle, or
~ a dictionary pointer.
~
~ The term "handle" was widely known in the early days of microcomputing,
~ when memory-safe languages without direct pointer access were less common.
~ Today it is usually considered specific to systems programming, the type of
~ programming which lies beneath other software and deals with topics such as
~ memory management and processes. Evocation is a systems-programming
~ language, in the sense that it takes pains to not introduce mandatory
~ abstractions which would make it difficult or inefficient to work directly
~ with these topics. So, in understanding Evocation, it's important to know
~ about handles.
~
~ Some of these bootstrap words rely on being able to invoke assembler words
~ that output machine code. Therefore, those words must be available at
~ runtime. Since nothing can be dynamically available at runtime until after
~ we've already run the log-load routine, which relies on the stuff in this
~ file, the assembler words must be statically available via the label
~ transform. That means their definitions in arm64.e must be loaded before
~ this file.
~
~ The constant log-offset, defined below, has the same value as the constant
~ control-stack-size, which is defined in execution.e. Everything will break
~ if the two ever differ.
~
~ TODO: remove one of them. Probably the other one.
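~
~ As a worked example of the addresses computed by the helpers below,
~ assuming a hypothetical log address of 0x40000000 (the real address is
~ whatever the kernel hands back at startup):
~
~   log-load-log      0x40000000 + 0x10000 + 0 * 8  =  0x40010000
~   log-load-s0       0x40000000 + 0x10000 + 1 * 8  =  0x40010008
~   log-load-r0       0x40000000 + 0x10000 + 2 * 8  =  0x40010010
~   log-load-latest   0x40000000 + 0x10000 + 3 * 8  =  0x40010018
~   log-load-here     0x40000000 + 0x10000 + 4 * 8  =  0x40010020
~
~ Each helper leaves the log address on the stack beneath its result, so
~ they can be chained without keeping a separate copy around.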
: log-offset 0x10000 ; ~ 64 KiB
~ (log address -- log address, "log" pointer)
: log-load-log
dup log-offset + ;
~ (log address -- log address, "s0" pointer)
: log-load-s0
dup log-offset + 8 + ;
~ (log address -- log address, "r0" pointer)
: log-load-r0
dup log-offset + 2 8 * + ;
~ (log address -- log address, "latest" pointer)
: log-load-latest
dup log-offset + 3 8 * + ;
~ (log address -- log address, "here" pointer)
: log-load-here
dup log-offset + 4 8 * + ;
~ This is a helper used by warm-start, which invokes find-in using "latest".
~ It relies on being passed the root address of the log, which is used to find
~ the global variable "latest". It's inconvenient to keep a log pointer around
~ all the time, which is why we stop doing it as soon as possible, but during
~ Evocation's startup there's no alternative. This word is used extensively
~ by code that's been compiled via the log-load transform; see transform.e for
~ details.
~
~ It would be possible to unload this word after the log is created, but
~ there are rare situations in which it's still useful, such as injecting
~ Evocation into another process's address space. Plus, it's small. So, we
~ keep it around.
~
~ (log address, string pointer -- log address, entry pointer or 0)
: log-load-find
swap log-load-latest @ swap 3unroll swap find-in ;
~ In the code generated by the log-load transform, it's convenient to have
~ only a single step needed to look up a word's execution token. This helper
~ does log-load-find, then gets the execution token if an entry is found.
~
~ (log address, string pointer -- log address, execution token or 0)
: log-load-find-execution-token
dup 3unroll log-load-find dup
{
3roll drop
entry-to-execution-token
} {
drop swap
." No such word: " emitstring newline
0
} if-else ;
~ This is the same as "create", from dynamic.e, except that it takes the
~ log's address as a parameter rather than hardcoding it, so that it can be
~ used in situations where the normal compilation process isn't yet available.
~
~ The requisite stack juggling is kind of finicky; sorry if it's hard to
~ read. It does the same steps in the same order as the regular "create".
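~
~ As a sketch of the layout this produces, read off the code below rather
~ than from any separate specification, a new entry starting at the old
~ value of "here" looks like this:
~
~   offset 0     8 bytes   next-entry pointer (the old value of "latest")
~   offset 8     1 byte    zero
~   offset 9     1 byte    zero
~   offset 10    n bytes   name field, where n is stringlen + 1
~   offset 10+n            padding, assuming "8 packalign" rounds the output
~                          point up to a multiple of 8
~
~ For example, assuming stringlen reports 3 for a three-letter name and
~ "here" starts out 8-byte aligned, the name field is 4 bytes, the fields
~ end at offset 14, and the new value of "here" is offset 16. Afterward,
~ "latest" holds the address of offset 0.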
~
~ (log address, string pointer -- log address)
: log-load-create
dup stringlen 1 + dup 3unroll
~ (log address, name field length, string pointer, name field length)
3 pick log-load-here swap drop @ 10 + 3unroll memmove
~ (log address, name field length)
over log-load-here swap drop @
~ (log address, name field length, output point)
2 pick log-load-latest swap drop @ pack64
~ (log address, name field length, output point)
0 pack8
0 pack8
+
~ (log address, output point)
8 packalign
~ (log address, output point)
over log-load-here swap drop @
~ (log address, output point, old here value)
2 pick log-load-latest swap drop !
~ (log address, output point)
over log-load-here swap drop ! ;
~ This is the same as ",", from dynamic.e, except that it takes the log's
~ address as a parameter rather than hardcoding it, so that it can be used in
~ situations where the normal compilation process isn't yet available.
~
~ Again, the stack juggling is kind of a lot; sorry about that.
~
~ (log address, value -- log address)
: log-load-comma
swap log-load-here swap 3unroll
~ (log address, value, here)
@ swap pack64
~ (log address, updated here value)
swap log-load-here swap 3unroll
~ (log address, updated here value, here)
! ;
~ This is the same as ";asm", from dynamic.e, except that it takes the
~ log's address as a parameter rather than hardcoding it, so that it can be
~ used in situations where the normal compilation process isn't yet available.
~
~ Its two main responsibilities are to call "pack-next", from
~ execution-support.e, and to overwrite the codeword. It also deals with
~ alignment.
~
~ (log address -- log address)
: log-load-semicolon-assembly
log-load-here @
~ (log address, output point)
pack-next
8 packalign
~ (log address, output point)
swap log-load-here swap 3unroll !
~ (log address)
log-load-latest @
~ (log address, entry pointer)
entry-to-execution-token
dup 8 + swap ! ;
~ This is the same as "variable", from dynamic.e, except that it takes the
~ log's address as a parameter rather than hardcoding it, so that it can be
~ used in situations where the normal compilation process isn't yet available.
~
~ (log address, address for new variable word, string pointer -- log address)
: log-load-variable
3roll swap log-load-create
~ (address for new variable word, log address)
log-load-here swap 3unroll
~ (log address, address for new variable word, here)
dup @
~ (log address, address for new variable word, here, output point)
dup 8 + pack64
3roll :rax mov-reg64-imm64
~ (log address, here, output point)
:rax push-reg64
pack-next
8 packalign
swap ! ;
~ A keyword is a word that evaluates to its own address, which makes it
~ suitable for use as a constant. See more detail on that in dynamic.e,
~ where "keyword" is defined.
~
~ Unlike in Common Lisp, the lexer doesn't create keywords for us; we have to
~ do it explicitly. If that were to someday change, the log-load routine
~ would still need a way to do it, which is this.
~
~ It's kind of a pain to look up the appropriate "docol" from here, so we
~ do it in assembler instead.
~
~ (log address, string pointer -- log address)
: log-load-keyword
log-load-create
~ (log address)
log-load-here @ dup
~ (log address, self execution token, output point)
dup 8 + pack64
~ (log address, self execution token, output point)
swap :rax mov-reg64-imm64
~ (log address, output point)
:rax push-reg64
pack-next
8 packalign
~ (log address, output point)
swap log-load-here
~ (output point, log address, here)
swap 3unroll
~ (log address, output point, here)
! ;
~ This is a helper used by log-load-string-alternate. It does the usual
~ string packing thing, but at one layer of indirection more than usual. Its
~ responsibility includes alignment, unlike packstring.
~
~ (log address, string pointer -- log address)
: log-load-comma-string
swap log-load-here @ 3roll
~ (log address, output point, string pointer)
packstring
8 packalign
~ (log address, output point)
swap log-load-here 3roll swap
~ (log address, output point, here)
! ;
~ Now we have a bunch of words that are the back-ends for the log-load
~ transform's high-level flow control alternates. These implementations
~ closely parallel the non-transformed versions in flow-control.e, which
~ should be referenced in understanding them.
~
~ These variants are a bit unusual in their interfaces: They end with the
~ log address at the top of the stack, even when they have values to return.
~ That's because they're really just "talking" to each other; they don't need
~ to interact with anything else, and doing it this way saves the alternates
~ the work of swapping things around after.
~
~ Notice also that, because these run entirely at log-load time, they are
~ always dealing with target pointers and don't have to convert address
~ spaces.
~
~ (log address -- start pointer, log address)
: log-load-left-curly-brace
log-load-here @ swap ;
~ (start pointer, log address -- start pointer, length, log address)
: log-load-right-curly-brace
swap dup 3roll
~ (start pointer, start pointer, log address)
log-load-here @ swap
~ (start pointer, start pointer, end pointer, log address)
3unroll swap - swap ;
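~ As a sketch of what log-load-if leaves in the log, read off the code
~ below: the block is moved up by five cells, and five cells are written in
~ the space that opens up in front of it.
~
~   start + 0 * 8    execution token of "lit"
~   start + 1 * 8    0
~   start + 2 * 8    execution token of "!="
~   start + 3 * 8    execution token of "0branch"
~   start + 4 * 8    length + 8
~   start + 5 * 8    the block itself, length bytes
~
~ Afterward, "here" is its old value plus 5 * 8. For a hypothetical block of
~ 24 bytes, the cell after "0branch" holds 32 and "here" advances by 40.
~ log-load-unless does the same thing, except that it looks up "=" rather
~ than "!=".
~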
~ (start, length, log address -- log address)
: log-load-if
3unroll
~ (log address, start, length)
2dup swap dup
~ (log address, start, length, length, start, start)
5 8 * +
~ (log address, start, length, length, start, adjusted start)
3unroll swap
~ (log address, start, length, adjusted start, start, length)
memmove
~ (log address, start, length)
swap 3roll log-load-here dup @
~ (length, start, log address, here pointer, old here)
swap 4 roll swap
~ (length, log address, old here, start, here pointer)
!
~ (length, log address, old here)
3unroll
~ (old here, length, log address)
s" lit" log-load-find entry-to-execution-token log-load-comma
0 log-load-comma
s" !=" log-load-find entry-to-execution-token log-load-comma
s" 0branch" log-load-find entry-to-execution-token log-load-comma
~ (old here, length, log address)
swap dup 3unroll
~ (old here, length, log address, length)
8 + log-load-comma
~ (old here, length, log address)
3unroll
~ (log address, old here, length)
drop 5 8 * +
~ (log address, new here)
swap log-load-here
~ (new here, log address, here pointer)
3roll swap ! ;
~ (start, length, log address -- log address)
: log-load-unless
3unroll
~ (log address, start, length)
2dup swap dup
~ (log address, start, length, length, start, start)
5 8 * +
~ (log address, start, length, length, start, adjusted start)
3unroll swap
~ (log address, start, length, adjusted start, start, length)
memmove
~ (log address, start, length)
swap 3roll log-load-here dup @
~ (length, start, log address, here pointer, old here)
swap 4 roll swap
~ (length, log address, old here, start, here pointer)
!
~ (length, log address, old here)
3unroll
~ (old here, length, log address)
s" lit" log-load-find entry-to-execution-token log-load-comma
0 log-load-comma
s" =" log-load-find entry-to-execution-token log-load-comma
s" 0branch" log-load-find entry-to-execution-token log-load-comma
~ (old here, length, log address)
swap dup 3unroll
~ (old here, length, log address, length)
8 + log-load-comma
~ (old here, length, log address)
3unroll
~ (log address, old here, length)
drop 5 8 * +
~ (log address, new here)
swap log-load-here
~ (new here, log address, here pointer)
3roll swap ! ;
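~ As a sketch of what log-load-if-else leaves in the log, read off the code
~ below, and assuming the false block immediately follows the true block
~ (as it does when the two curly-brace pairs are compiled back to back):
~
~   true start + 0 * 8     execution token of "lit"
~   true start + 1 * 8     0
~   true start + 2 * 8     execution token of "!="
~   true start + 3 * 8     execution token of "0branch"
~   true start + 4 * 8     true length + 3 * 8
~   true start + 5 * 8     the true block, moved up by five cells
~   false start + 5 * 8    execution token of "branch"
~   false start + 6 * 8    false length + 8
~   false start + 7 * 8    the false block, moved up by seven cells
~
~ Afterward, "here" is its old value plus 7 * 8. The false block is moved
~ first, so that moving the true block doesn't overwrite it.
~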
~ (true start, true length, false start, false length, log address
~ -- log address)
: log-load-if-else
5 unroll 2dup
~ (log address, true start, true length, false start, false length,
~ false start, false length)
swap dup 7 8 * + swap 3roll
~ (log address, true start, true length, false start, false length,
~ adjusted false start, false start, false length)
memmove
~ (log address, true start, true length, false start, false length)
4 roll dup 5 unroll
~ (log address, true start, true length, false start, false length,
~ true start)
4 roll dup 5 unroll
~ (log address, true start, true length, false start, false length,
~ true start, true length)
swap dup 5 8 * +
~ (log address, true start, true length, false start, false length,
~ true length, true start, adjusted true start)
swap 3roll
~ (log address, true start, true length, false start, false length,
~ adjusted true start, true start, true length)
memmove
~ (log address, true start, true length, false start, false length)
4 roll dup 5 unroll
~ (log address, true start, true length, false start, false length,
~ true start)
6 roll log-load-here @ 7 unroll
~ (old here, true start, true length, false start, false length, true start,
~ log address)
log-load-here 3roll swap !
~ (old here, true start, true length, false start, false length,
~ log address)
s" lit" log-load-find entry-to-execution-token log-load-comma
0 log-load-comma
s" !=" log-load-find entry-to-execution-token log-load-comma
s" 0branch" log-load-find entry-to-execution-token log-load-comma
~ (old here, true start, true length, false start, false length,
~ log address)
4 roll dup 5 unroll
~ (old here, true start, true length, false start, false length,
~ log address, true length)
3 8 * + log-load-comma
~ (old here, true start, true length, false start, false length,
~ log address)
3unroll
~ (old here, true start, true length, log address,
~ false start, false length)
swap dup 3unroll
~ (old here, true start, true length, log address,
~ false start, false length, false start)
5 8 * +
~ (old here, true start, true length, log address,
~ false start, false length, adjusted false start)
4 roll log-load-here 3roll swap !
~ (old here, true start, true length,
~ false start, false length, log address)
s" branch" log-load-find entry-to-execution-token log-load-comma
swap 8 + log-load-comma
~ (old here, true start, true length, false start, log address)
4 unroll
~ (old here, log address, true start, true length, false start)
drop drop drop
~ (old here, log address)
log-load-here
3roll 7 8 * + swap ! ;
~ (start, length, log address -- log address)
: log-load-forever
s" branch" log-load-find entry-to-execution-token log-load-comma
swap 8 + -1 * log-load-comma
swap drop ;
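~ As a sketch of what log-load-while leaves in the log, read off the code
~ below: the test block stays where it is, the body is moved up by five
~ cells, and two more cells are appended at the end.
~
~   test start           the test block, test length bytes
~   body start + 0 * 8   execution token of "lit"
~   body start + 1 * 8   0
~   body start + 2 * 8   execution token of "!="
~   body start + 3 * 8   execution token of "0branch"
~   body start + 4 * 8   body length + 3 * 8
~   body start + 5 * 8   the body block, body length bytes
~   old here + 5 * 8     execution token of "branch"
~   old here + 6 * 8     -1 * (test length + body length + 6 * 8)
~
~ Afterward, "here" is its old value plus 7 * 8. For hypothetical lengths of
~ 16 bytes for the test and 24 for the body, the forward offset is 48 and
~ the backward offset is -88.
~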
~ (test start, test length, body start, body length, log address
~ -- log address)
: log-load-while
5 unroll 2dup
~ (log address, test start, test length, body start, body length,
~ body start, body length)
swap dup 5 8 * + swap 3roll
~ (log address, test start, test length, body start, body length,
~ adjusted body start, body start, body length)
memmove
~ (log address, test start, test length, body start, body length)
5 roll log-load-here @ 6 unroll
~ (old here, test start, test length, body start, body length, log address)
log-load-here 4 roll dup 5 unroll swap !
~ (old here, test start, test length, body start, body length, log address)
s" lit" log-load-find entry-to-execution-token log-load-comma
0 log-load-comma
s" !=" log-load-find entry-to-execution-token log-load-comma
s" 0branch" log-load-find entry-to-execution-token log-load-comma
swap dup 3unroll 3 8 * + log-load-comma
~ (old here, test start, test length, body start, body length, log address)
log-load-here 5 8 * 8 roll + swap !
~ (test start, test length, body start, body length, log address)
s" branch" log-load-find entry-to-execution-token log-load-comma
5 unroll
~ (log address, test start, test length, body start, body length)
6 8 * + swap drop + swap drop -1 * log-load-comma ;