| author | Irene Knapp <ireneista@irenes.space> | 2026-05-15 16:19:07 -0700 |
|---|---|---|
| committer | Irene Knapp <ireneista@irenes.space> | 2026-05-15 16:19:07 -0700 |
| commit | 1af849abc637c4890285c4d3cc08d99faae2ea41 (patch) | |
| tree | 0062b680a357a0820cafe8ce8f2c3f3f65ee99a2 | |
| parent | 13ec0a067a70ee1e5c0a287972c2f9f07d4e1486 (diff) | |
document the assumptions between labels and transforms
also some minor cleanup that ie. fixes alignment padding

Force-Push: yes
Change-Id: Ia8fcb9a44e05e37751cc1b8334aa72add7e62353
| -rw-r--r-- | evoke.e | 2 |
| -rw-r--r-- | execution.e | 1 |
| -rw-r--r-- | labels.e | 40 |
| -rw-r--r-- | transform.e | 46 |
4 files changed, 68 insertions, 21 deletions
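
The convergence rule that the labels.e comments in this commit describe (repeat passes until every label ends a pass with the value it started with, and give up after a hundred passes) can be sketched outside Evocation. The Python below is only an illustration under assumptions: `label_loop`, `use`, `set_label`, `layout`, and the default guess of 0 for a label that has not been set yet are hypothetical names and choices, not the project's API, and it raises an exception on redefinition where the commit's set-label prints a diagnostic and exits.

```python
MAX_PASSES = 100  # the labels.e comment: a hundred passes without convergence is failure


def label_loop(body, max_passes=MAX_PASSES):
    """Call body(use, set_label) repeatedly until no label value changes."""
    values = {}  # label name -> value carried over from the previous pass
    for pass_number in range(1, max_passes + 1):
        before = dict(values)  # snapshot taken at the start of the pass
        defined = set()        # labels already set during this pass

        def use(name):
            # Assumption: a label nobody has set yet reads as 0.
            return values.get(name, 0)

        def set_label(name, value):
            # Setting the same label twice in one pass is an error.
            if name in defined:
                raise ValueError("attempt to redefine " + name)
            defined.add(name)
            values[name] = value

        body(use, set_label)
        if values == before:
            return values, pass_number  # converged
    raise RuntimeError("labels did not converge")


def layout(use, set_label):
    """Hypothetical relying code: it alone decides what goes where."""
    out = []
    set_label("start", len(out))
    out.append(("jump", use("end")))  # forward reference, guessed on pass 1
    out.append(("nop", None))
    set_label("end", len(out))


values, passes = label_loop(layout)
print(values, passes)
```

The relying code, here `layout`, owns every decision about output; the loop only observes the sequence of set and use operations, which mirrors the division of responsibility the new labels.e comment spells out.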
diff --git a/evoke.e b/evoke.e
index 8a024ce..c34ab2a 100644
--- a/evoke.e
+++ b/evoke.e
@@ -2,7 +2,7 @@
 ~ | ./quine > evoke && chmod 755 evoke && ./evoke
 1024 read-to-buffer
-: foo 5 . [ 4 . ] ;
+: foo ;asm
 pyrzqxgl
 s" source-to-precompile" variable
diff --git a/execution.e b/execution.e
index 04c859b..1100efd 100644
--- a/execution.e
+++ b/execution.e
@@ -597,6 +597,7 @@
 8 :rax add-reg64-imm8
 :rax :rsi mov-reg64-reg64
 pack-next
+ 8 packalign
 ;
 ~ This is the mechanism to "return" from a word interpreted by docol.
diff --git a/labels.e b/labels.e
index b524229..bfc85fb 100644
--- a/labels.e
+++ b/labels.e
@@ -33,6 +33,20 @@
 ~ same value they started with, we say they've converged, and we announce
 ~ success. If a hundred passes go by without convergence, we fail instead.
 ~
+~ Importantly, the label facility does not have an opinion on where or how
+~ the values of labels should be output, nor does it have concepts of
+~ dependency tracking or topological sorting. It only cares about the sequence
+~ of set- and use-operations. Code relying on the facility is free to make
+~ arbitrarily complex decisions about the layout, size, contents etc of its
+~ output, and to use label values as part of those decisions. It is the
+~ responsibility of this relying code to ensure the process converges, by
+~ whatever means it wishes.
+~
+~ Since the label facility is not directly involved in output, it is also
+~ permissible for code relying on it to make retroactive changes to the
+~ output, as long as all such changes are completed before returning control
+~ to label-loop.
+~
 ~ Most of the time, the shorthand words L@' and L!' will be all you need to
 ~ use from your own code.
@@ -54,16 +68,9 @@
 ~
 ~ Just as the Evocation dictionary uses the global variable "latest" as a
 ~ handle (a pointer to a pointer) beginning a linked list of entries, so the
-~ label dictionary also uses a handle variable, named "labels".
-
-~ TODO we should just do this in immediate mode, but right now the word
-~ "variable" steps on the same scratch space that s" uses, so we can't.
-: init-labels
- 8 allocate s" labels" variable
- 0 s" labels" find entry-to-execution-token execute !
- ;
-~ This needs to happen now because otherwise the word "labels" won't exist.
-init-labels
+~ label dictionary also uses a handle variable, named "labels". We initialize
+~ it here.
+8 allocate s" labels" variable 0 labels !
 ~ This is analogous to word-heading, but prints label information.
 ~ (entry pointer --)
@@ -144,11 +151,22 @@ init-labels
 ~
 ~ (new label value, label entry pointer --)
 : set-label
+ dup label-status @
+ ~ (new value, entry, initial status)
+
+ ~ Exit, and print a diagnostic, if the label was already set on this pass.
+ ~
+ ~ Without cooperation from user code, there's no great way to communicate
+ ~ up the call stack that execution should stop, so we don't try to.
+ dup 0x02 & { drop swap drop
+ ." Failed by attempting to redefine "
+ entry-to-name emitstring newline exit } if
+
 ~ We always set the defined bit to true. We leave the other status bits
 ~ as-is, except that we check whether it's unused up to now, and whether
 ~ the previous value equals the new value. If either of those holds, we set
 ~ the guessed-equals-actual bit to true.
- dup label-status @ 0x02 |
+ 0x02 |
 ~ (new value, entry, updated status)
 3unroll dup 4 unroll label-value @ swap dup 3unroll =
 ~ (entry, updated status, new value, equality)
diff --git a/transform.e b/transform.e
index a34e4a2..1c325df 100644
--- a/transform.e
+++ b/transform.e
@@ -88,12 +88,43 @@
 dup transform-state-saved-latest 0 swap !
 ;
 allocate-transform-state s" transform-state" variable
-~ We allow immediate words to run, which means for example that flow-control
-~ hexdump-between words such as if-else expect to be able to mutate the heap,
-~ and they expect to find the stuff they've been compiling present on it. We
-~ make this possible by swapping out the value of "here" during the execution
-~ of transformed code. By focusing on "here", we get to keep the existing
-~ ",".
+~ The label transform operates on code that compiles itself, and ensures
+~ that the result of the compilation is suitable to be included in an
+~ executable binary. To achieve this, it makes several changes to the
+~ semantics of that code. The transform relies on the label facility, and
+~ expects to run from within label-loop.
+~
+~ The most fundamental change is that the label transform separates words
+~ that run in compile mode from words that run immediately. There is no
+~ distinction made between words running in immediate mode, and words declared
+~ as immediate. Immediate words are looked up and executed based on their
+~ "real", currently-executing definitions. Compiled words, including
+~ literals, are looked up via the label facility.
+~
+~ Since the label facility is able to resolve forward references, there is
+~ no hard requirement that words be topologically sorted, but forward
+~ references should still be kept to a minimum, since that's a significant
+~ difference from un-transformed code that could easily become confusing.
+~
+~ Compilation words do make extensive reference to the global variables
+~ "here" and "latest". In particular, flow-control words such as if-else
+~ expect the log to have recent compilation outputs on it, and to be able to
+~ mutate them in-place. In order to make this work, we provide temporary
+~ values of these two variables which point to the location of the output
+~ buffer. This allows pointer resolution to work correctly without additional
+~ effort, but notice that the buffer's address will differ from the address
+~ the resulting program loads itself at. There's no simple way to avoid this
+~ concern, since the variables must point to one of those addresses or the
+~ other, not both.
+~
+~ We resolve the issue by running our own, alternate versions of the words
+~ "create", ":", ";", and ";asm" which use the label facility to compute the
+~ addresses that will be needed at runtime. These alternates run instead of
+~ the normal versions of these words. The code being compiled is responsible
+~ for not doing anything else that would rely on "here" and "latest" matching
+~ their runtime addresses, though it is otherwise allowed to modify and rely
+~ on them in all the usual ways. The alternate versions are defined in this
+~ file as their own words, "Lcreate", "L:", "L;", and "L;asm".
 ~
 ~ (-- done)
 : transform-one
@@ -188,9 +219,6 @@ allocate-transform-state s" transform-state" variable
 ~ TODO the arguments for this seem to be backwards from the documentation
 swap attach-string-to-input-buffer
- ~ Set the initial inner value of "latest".
- 0 transform-state transform-state-saved-latest !
-
 ~ Save the old values of "here" and "latest", and set the initial values
 ~ of the internal ones. These values need to persist across iterations,
 ~ since client code will make its own updates to them and then rely on those