Diffstat (limited to 'transform.e')
| -rw-r--r-- | transform.e | 46 |
1 files changed, 37 insertions, 9 deletions
diff --git a/transform.e b/transform.e
index a34e4a2..1c325df 100644
--- a/transform.e
+++ b/transform.e
@@ -88,12 +88,43 @@
   dup transform-state-saved-latest 0 swap ! ;
 allocate-transform-state s" transform-state" variable
 
-~ We allow immediate words to run, which means for example that flow-control
-~ hexdump-between words such as if-else expect to be able to mutate the heap,
-~ and they expect to find the stuff they've been compiling present on it. We
-~ make this possible by swapping out the value of "here" during the execution
-~ of transformed code. By focusing on "here", we get to keep the existing
-~ ",".
+~ The label transform operates on code that compiles itself, and ensures
+~ that the result of the compilation is suitable to be included in an
+~ executable binary. To achieve this, it makes several changes to the
+~ semantics of that code. The transform relies on the label facility, and
+~ expects to run from within label-loop.
+~
+~ The most fundamental change is that the label transform separates words
+~ that run in compile mode from words that run immediately. There is no
+~ distinction made between words running in immediate mode, and words declared
+~ as immediate. Immediate words are looked up and executed based on their
+~ "real", currently-executing definitions. Compiled words, including
+~ literals, are looked up via the label facility.
+~
+~ Since the label facility is able to resolve forward references, there is
+~ no hard requirement that words be topologically sorted, but forward
+~ references should still be kept to a minimum, since that's a significant
+~ difference from un-transformed code that could easily become confusing.
+~
+~ Compilation words do make extensive reference to the global variables
+~ "here" and "latest". In particular, flow-control words such as if-else
+~ expect the log to have recent compilation outputs on it, and to be able to
+~ mutate them in-place. In order to make this work, we provide temporary
+~ values of these two variables which point to the location of the output
+~ buffer. This allows pointer resolution to work correctly without additional
+~ effort, but notice that the buffer's address will differ from the address
+~ the resulting program loads itself at. There's no simple way to avoid this
+~ concern, since the variables must point to one of those addresses or the
+~ other, not both.
+~
+~ We resolve the issue by running our own, alternate versions of the words
+~ "create", ":", ";", and ";asm" which use the label facility to compute the
+~ addresses that will be needed at runtime. These alternates run instead of
+~ the normal versions of these words. The code being compiled is responsible
+~ for not doing anything else that would rely on "here" and "latest" matching
+~ their runtime addresses, though it is otherwise allowed to modify and rely
+~ on them in all the usual ways. The alternate versions are defined in this
+~ file as their own words, "Lcreate", "L:", "L;", and "L;asm".
 ~
 ~ (-- done)
 : transform-one
@@ -188,9 +219,6 @@ allocate-transform-state s" transform-state" variable
   ~ TODO the arguments for this seem to be backwards from the documentation
   swap attach-string-to-input-buffer
 
-  ~ Set the initial inner value of "latest".
-  0 transform-state transform-state-saved-latest !
-
   ~ Save the old values of "here" and "latest", and set the initial values
   ~ of the internal ones. These values need to persist across iterations,
   ~ since client code will make its own updates to them and then rely on those
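
The new comment block describes two address spaces: the output buffer that "here" temporarily points into, and the address the finished program loads at. The following is a minimal Python sketch of that idea, not the implementation in transform.e. The `Output` class, the two base addresses, and the 8-byte little-endian slot format are all assumptions made for illustration; only the general scheme (record a label, patch the runtime address later, allow forward references) comes from the comment.

```python
import struct

# Both addresses are hypothetical values chosen for the illustration.
BUFFER_BASE = 0x7F000000  # where the output buffer lives while compiling
LOAD_BASE = 0x00400000    # where the finished binary will load itself


class Output:
    """Toy output buffer with a label table (hypothetical, for illustration)."""

    def __init__(self):
        self.data = bytearray()
        self.labels = {}   # label name -> offset into the buffer
        self.fixups = []   # (offset of an 8-byte slot, label name)

    @property
    def here(self):
        # The temporary "here": a buffer address, not a runtime address.
        return BUFFER_BASE + len(self.data)

    def define(self, name):
        # Record the current position under a label.
        self.labels[name] = len(self.data)

    def emit_ref(self, name):
        # Reserve a slot now; the label may be defined later (forward ref).
        self.fixups.append((len(self.data), name))
        self.data += bytes(8)

    def resolve(self):
        # Patch every slot with the address the label has at runtime.
        for offset, name in self.fixups:
            runtime = LOAD_BASE + self.labels[name]
            struct.pack_into("<Q", self.data, offset, runtime)
        return bytes(self.data)


out = Output()
out.emit_ref("later")   # forward reference to a label not yet defined
out.define("later")     # label lands at offset 8
out.data += b"\x90"     # one byte of placeholder payload
image = out.resolve()
```

In this sketch the slot ends up holding `LOAD_BASE + 8` even though `out.here` reported buffer addresses throughout, which is the gap the comment says "Lcreate", "L:", "L;", and "L;asm" are there to bridge.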