author Irene Knapp <ireneista@irenes.space> 2026-05-15 16:19:07 -0700
committer Irene Knapp <ireneista@irenes.space> 2026-05-15 16:19:07 -0700
commit 1af849abc637c4890285c4d3cc08d99faae2ea41
tree 0062b680a357a0820cafe8ce8f2c3f3f65ee99a2 /transform.e
parent 13ec0a067a70ee1e5c0a287972c2f9f07d4e1486
document the assumptions between labels and transforms
also some minor cleanup that ie. fixes alignment padding

Force-Push: yes
Change-Id: Ia8fcb9a44e05e37751cc1b8334aa72add7e62353
Diffstat (limited to 'transform.e')
-rw-r--r-- transform.e 46
1 files changed, 37 insertions, 9 deletions
diff --git a/transform.e b/transform.e
index a34e4a2..1c325df 100644
--- a/transform.e
+++ b/transform.e
@@ -88,12 +88,43 @@
   dup transform-state-saved-latest 0 swap ! ;
 allocate-transform-state s" transform-state" variable
 
-~   We allow immediate words to run, which means for example that flow-control
-~ hexdump-between words such as if-else expect to be able to mutate the heap,
-~ and they expect to find the stuff they've been compiling present on it. We
-~ make this possible by swapping out the value of "here" during the execution
-~ of transformed code. By focusing on "here", we get to keep the existing
-~ ",".
+~   The label transform operates on code that compiles itself, and ensures
+~ that the result of the compilation is suitable to be included in an
+~ executable binary. To achieve this, it makes several changes to the
+~ semantics of that code. The transform relies on the label facility, and
+~ expects to run from within label-loop.
+~
+~   The most fundamental change is that the label transform separates words
+~ that run in compile mode from words that run immediately.  There is no
+~ distinction made between words running in immediate mode, and words declared
+~ as immediate. Immediate words are looked up and executed based on their
+~ "real", currently-executing definitions. Compiled words, including
+~ literals, are looked up via the label facility.
+~
+~   Since the label facility is able to resolve forward references, there is
+~ no hard requirement that words be topologically sorted, but forward
+~ references should still be kept to a minimum, since that's a significant
+~ difference from un-transformed code that could easily become confusing.
+~
+~   Compilation words do make extensive reference to the global variables
+~ "here" and "latest". In particular, flow-control words such as if-else
+~ expect the log to have recent compilation outputs on it, and to be able to
+~ mutate them in-place. In order to make this work, we provide temporary
+~ values of these two variables which point to the location of the output
+~ buffer. This allows pointer resolution to work correctly without additional
+~ effort, but notice that the buffer's address will differ from the address
+~ the resulting program loads itself at. There's no simple way to avoid this
+~ concern, since the variables must point to one of those addresses or the
+~ other, not both.
+~
+~   We resolve the issue by running our own, alternate versions of the words
+~ "create", ":", ";", and ";asm" which use the label facility to compute the
+~ addresses that will be needed at runtime. These alternates run instead of
+~ the normal versions of these words. The code being compiled is responsible
+~ for not doing anything else that would rely on "here" and "latest" matching
+~ their runtime addresses, though it is otherwise allowed to modify and rely
+~ on them in all the usual ways. The alternate versions are defined in this
+~ file as their own words, "Lcreate", "L:", "L;", and "L;asm".
 ~
 ~ (-- done)
 : transform-one
@@ -188,9 +219,6 @@ allocate-transform-state s" transform-state" variable
   ~ TODO the arguments for this seem to be backwards from the documentation
   swap attach-string-to-input-buffer
 
-  ~ Set the initial inner value of "latest".
-  0 transform-state transform-state-saved-latest !
-
   ~   Save the old values of "here" and "latest", and set the initial values
   ~ of the internal ones. These values need to persist across iterations,
   ~ since client code will make its own updates to them and then rely on those