okay there's a strat for making the here/latest variables now

it doesn't work yet, but it's gonna

the reason it doesn't work is that the new helper log-load-variable relies on the assembly-definition words being statically available, and they aren't yet

that's fine though, this is still a huge change, worth checking in. why? well, it represents like 16 hours of debugging which culminated in some very minor changes to the semantics of the label transform, in order to make missing words easier to notice and debug. see comments for details.

woooooo :D

Force-Push: yes
Change-Id: Id8334819d165ba9e3156ef2bf32008af748eac29
author: Irene Knapp <ireneista@irenes.space> 2026-05-18 16:36:08 -0700
committer: Irene Knapp <ireneista@irenes.space> 2026-05-18 16:36:08 -0700
commit: eec336dea3d86e176c4bd86c435e6be35fec64e2
tree: 9d72ffb1ab138e693ef8de7cbfec58c10d04d412
parent: 5820c29532ba877ac50b331658a4b4dd7f138ded
6 files changed, 441 insertions, 140 deletions
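
A rough model of the per-word fall-through that the label transform follows after this change, as described by the comments in transform.e. It's written in Python rather than the project's own language so that it can be run standalone. The names dispatch, parse_integer, labels, and entries are made up for illustration and don't appear anywhere in the source; the ordering of the checks is one reading of the comments in the diff, not a copy of the implementation.

```python
def parse_integer(text):
    """Return the integer value of text, or None if it isn't a literal."""
    try:
        return int(text, 0)  # base 0 accepts decimal and 0x-prefixed hex
    except ValueError:
        return None


def dispatch(name, *, compiling, labels, entries):
    """Classify one word the way the comments in transform.e describe.

    labels  -- set of names with a known label (hypothetical stand-in)
    entries -- dict of name -> is_immediate for host dictionary entries
               (hypothetical stand-in)
    """
    candidate = entries.get(name)  # None means there is no entry at all

    if compiling:
        if candidate is True:
            # Flagged immediate: run it even though we're compiling.
            return ("run-immediate", name)
        if name in labels:
            return ("compile-label", name)
        # A non-immediate candidate is deliberately NOT run here; we fall
        # through so a genuinely missing word reaches the error below.
    elif candidate is not None:
        # Interpret mode: per the diff's comment, a non-zero entry
        # pointer is run.
        return ("run-immediate", name)

    # Both modes converge on the same two possibilities:
    # an integer literal, or a genuinely missing word.
    value = parse_integer(name)
    if value is not None:
        return ("literal", value)
    return ("no-such-word", name)
```

Under this model, dispatch("fooze", compiling=True, labels=set(), entries={"fooze": False}) returns ("no-such-word", "fooze") instead of running the host's copy of fooze, which is the behavior the commit message says makes missing words easier to notice.
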
diff --git a/core.e b/core.e
index b2062d1..0390812 100644
--- a/core.e
+++ b/core.e
@@ -8,8 +8,88 @@
 ~ of it is later copied into the log when that executable runs. Therefore, it
 ~ is written to obey the constraints of both the label transform, and the
 ~ log-load transform; see transform.e for more details on that.
+
+
+~ Execution support
+~ ~~~~~~~~~~~~~~~~~
+~
+~   There are two words, docol and exit, which are essential parts of the
+~ indirect-threaded execution model. It would be tempting to put them in
+~ execution.e, so they'd be closer to the explanation of what they do, but
+~ we need two copies of them (just like we do of every other word in this
+~ file), one statically compiled and one in the log. So, they're here, because
+~ that's significantly simpler, even though it creates a little bit of extra
+~ work for the label transform.
+
+
+~   Docol is the "interpreter" that is responsible for the semantics of words
+~ written as Forth high-level code. The name is said to be short for
+~ "do colon", because word definitions begin with a colon.
+~
+~   Concretely, when interpreting, it saves rsi (the "instruction pointer") to
+~ the control stack, takes the address of the codeword from rax and increments
+~ it in-place to form the new instruction pointer, and copies that to rsi.
+~
+~   Having then done this, we're now in the state that normal execution
+~ expects, so docol ends by using "next" to begin the callee's execution,
+~ kicking off a nested call.
 ~
+~   At runtime, invoke docol directly and it will return the value that should
+~ be used as a codeword. When compiling to a binary executable, the
+~ transformation facility needs to reference it directly, and for that
+~ purpose, the label "docol-codeword-value" points to the correct place.
+~
+~ Registers in:
+~
+~ * rsi is the caller's instruction pointer
+~ * rbp is the control stack pointer
+~ * rax is the address of the callee's codeword
+~
+~ Registers out:
+~
+~ * rsi is the callee's instruction pointer
+~ * rbp is the control stack pointer
+
+~   We need this one snippet of assembly code that's just on the log raw,
+~ without a word header, because of its role in making words work. We define
+~ a label pointing to it.
 ~
+~   This is the only use of labels in core.e. There's no easy way to
+~ remove it. We can't use immediate computation based on "here" because
+~ the label transform uses the host address space, not the target address
+~ space. The transformation facility's support for labels in transformed
+~ code was added just for this.
+~
+~   It may seem as if we could have used some extra space in the middle of
+~ the proper docol word, which is defined just below. That would work fine
+~ with the label transform, which can do forward references, but the log-load
+~ transform's label support is special-cased to ONLY do this, and it will only
+~ work with a backward reference.
+here @
+dup L!' docol-codeword-value
+:rsi pack-pushcontrol
+8 :rax add-reg64-imm8
+:rax :rsi mov-reg64-reg64
+pack-next
+8 packalign
+here !
+
+: docol
+  [ here @
+    ~ Evaluated as a word, docol is a constant which returns a pointer.
+    L@' docol-codeword-value :rax mov-reg64-imm64
+    :rax push-reg64
+    here ! ] ;asm
+
+~   This is the mechanism to "return" from a word interpreted by docol.
+~ We pop the control stack, and then, since this is threaded execution, we
+~ do the next thing the caller wants to do, by inlining "next".
+: exit
+  [ here @
+    :rsi pack-popcontrol
+    here ! ] ;asm
+
+
 ~ Stack manipulation routines
 ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~
@@ -1063,8 +1143,12 @@
 
 ~ (output point, alignment byte count -- output point)
 : packalign
-  { 2dup /% drop { drop exit } unless
-    swap 0 pack8 swap } forever ;
+  2dup /% drop 0branch [ 8 8 * , ]
+  swap 0 pack8 swap branch [ -11 8 * , ]
+  drop ;
+  ~ TODO this is the implementation we could use if we had more flow-control
+  ~ { 2dup /% drop { drop exit } unless
+  ~  swap 0 pack8 swap } forever ;
 
 
 ~ Binary unpacking
diff --git a/evoke.e b/evoke.e
index 607242d..869de31 100644
--- a/evoke.e
+++ b/evoke.e
@@ -7,8 +7,30 @@
 
 s" source-to-precompile" variable
 
+~ : fooze 4 . ; fooze
 1024 read-to-buffer
-: fooze 4 . ;
+here @
+dup L!' docol-codeword-value
+:rsi pack-pushcontrol
+8 :rax add-reg64-imm8
+:rax :rsi mov-reg64-reg64
+pack-next
+8 packalign
+here !
+
+: docol
+  [ here @
+    L@' docol-codeword-value :rax mov-reg64-imm64
+    :rax push-reg64
+    here ! ] ;asm
+
+: exit
+  [ here @
+    :rsi pack-popcontrol
+    here ! ] ;asm
+
+: foo 1 2 + ;
+0 sys-exit
 pyrzqxgl
 s" source-to-copy-to-log" variable
 
@@ -26,8 +48,6 @@ s" source-to-copy-to-log" variable
   elf-program-header
   output-cold-start
   source-to-copy-to-log output-warm-start
-  output-docol
-  output-exit
   source-to-precompile label-transform
   0 L!' final-word-name
   current-offset L!' total-size
diff --git a/execution.e b/execution.e
index daacddb..1b9e84d 100644
--- a/execution.e
+++ b/execution.e
@@ -484,16 +484,11 @@
   3unroll
   8 packalign
   current-offset L!' warm-start
-  3roll
-
-  log-load-transform
-
-  ~ TODO this is tied to the specific example in evoke
-  ~ L@' happy-path L@' origin + pack64
+  ~ (input string pointer, output buffer start, output point)
 
   ~   Before handing off to us, cold-start pushed a single value onto the
-  ~ stack, a pointer to the beginning of the heap. Now, we load our entire
-  ~ Forth implementation onto that heap, beginning with the minimal set of
+  ~ stack, a pointer to the beginning of the log. Now, we load our entire
+  ~ Forth implementation onto that log, beginning with the minimal set of
   ~ words needed to define more words. We do this because we need variables as
   ~ infrastructure so we can eventually have dynamic definitions.
   ~
@@ -514,117 +509,73 @@
   ~ That choice does mean we have the hard version of this bootstrapping
   ~ problem, and copying ourselves to the heap is how we solve it.
   ~
-  ~   We do have the heap address right now, though that won't last. In case
+  ~   We do have the log address right now, though that won't last. In case
   ~ it's unclear why not: keeping it on the stack would require all future
   ~ references to walk the stack, and somehow know when they've reached the
   ~ bottom. The stack is a good place to keep things with clearly delimited
   ~ lifetimes and visibility, but when we want something to live for our
   ~ entire program and be easy to find from any code within it, we need to
   ~ do something else. Anyway, since we have the address, we can use it for
-  ~ the next little bit of setup.
+  ~ the next little bit of setup. We have a bunch of helper words, from
+  ~ log-load.e, which make this easier.
   ~
   ~   The first few words we define are our variables, which hardcode the
   ~ addresses they will return - but since we're doing this at runtime,
-  ~ "hardcoding" can reflect where our heap is. This is the fundamental
-  ~ trick that makes the heap usable.
+  ~ "hardcoding" can reflect where our log is. This is the fundamental
+  ~ trick that makes the log usable.
   ~
   ~   One more thing to notice: We already allocated the backing stores of
-  ~ these variables, and populated their initial values, in _start. The
+  ~ these variables, and populated their initial values, in cold-start. The
   ~ words we're defining return those same addresses for the same backing
   ~ stores. So, we have continuity: Stuff defined in terms of the
   ~ variable-words we're defining now will interoperate with the stuff that
-  ~ we define in the "early" way, which includes those very words. Both the
-  ~ early code and the later code are dealing with the same data structures,
-  ~ they're just using a different technique to find them.
+  ~ we define using the log-load helpers, which includes those very words.
+  ~ Both the log-load code and the later code are dealing with the same data
+  ~ structures, they're just using a different technique to find them.
   ~
   ~   This is the only hardcoding we need to do; by building on top of it,
   ~ we will soon reach a point where the rest of the system can be defined
   ~ within itself.
-  ~ TODO These need to, like, exist first. Also they need to be referenced
-  ~ as labels.
-  ~ dq early_heap, litstring, "heap", early_variable
-  ~ dq early_s0, litstring, "s0", early_variable
-  ~ dq early_r0, litstring, "r0", early_variable
-  ~ dq early_latest, litstring, "latest", early_variable
-  ~ dq early_here, litstring, "here", early_variable
-  ;
 
+  L@' log-load-log offset-to-target-address-space pack64
+  L@' litstring offset-to-target-address-space pack64
+  s" log" packstring 8 packalign
+  L@' log-load-variable offset-to-target-address-space pack64
 
-~ (previous entry address, output point, name string pointer
-~  -- new entry address, output point)
-: output-create
-  3roll dup 4 roll swap pack64
-  ~ (string pointer, new entry address, output point)
-  0 pack8
-  0 pack8
-  roll3 packstring
-  ~ (new entry address, output point)
-  8 packalign
-  ;
+  L@' log-load-s0 offset-to-target-address-space pack64
+  L@' litstring offset-to-target-address-space pack64
+  s" s0" packstring 8 packalign
+  L@' log-load-variable offset-to-target-address-space pack64
 
+  L@' log-load-r0 offset-to-target-address-space pack64
+  L@' litstring offset-to-target-address-space pack64
+  s" r0" packstring 8 packalign
+  L@' log-load-variable offset-to-target-address-space pack64
 
-~ Routine docol
-~ ~~~~~~~~~~~~~
-~
-~   Reference this via its label as the codeword of a word to make it an
-~ "interpreted" word. Concretely, it saves rsi (the "instruction pointer")
-~ to the control stack, takes the address of the codeword from rax and
-~ increments it in-place to form the new instruction pointer, and copies
-~ that to rsi.
-~
-~   Having then done this, we're now in the state that normal execution
-~ expects, so docol ends by it using "next" to begin the callee's execution,
-~ kicking off a nested call.
-~
-~   The name is said to be short for "do colon", because Forth high-level
-~ code begins word definitions with a colon.
-~
-~ Registers in:
-~
-~ * rsi is the caller's instruction pointer
-~ * rbp is the control stack pointer
-~ * rax is the address of the callee's codeword
-~
-~ Registers out:
-~
-~ * rsi is the callee's instruction pointer
-~ * rbp is the control stack pointer
-~
-~ (previous entry address, output point)
-: output-docol
-  s" docol" output-create
+  L@' log-load-latest offset-to-target-address-space pack64
+  L@' litstring offset-to-target-address-space pack64
+  s" latest" packstring 8 packalign
+  L@' log-load-variable offset-to-target-address-space pack64
 
-  ~ Evaluated as a word, docol is a constant which returns a pointer.
-  L@' docol :rax mov-reg64-imm64
-  :rax push-reg64
-  pack-next
-  8 packalign
+  L@' log-load-here offset-to-target-address-space pack64
+  L@' litstring offset-to-target-address-space pack64
+  s" here" packstring 8 packalign
+  L@' log-load-variable offset-to-target-address-space pack64
 
-  ~ Since docol is not a normal word, the label points to the value we care
-  ~ about from the assembly side of things, which is the address we use as the
-  ~ codeword.
-  current-offset L!' docol
-  :rsi pack-pushcontrol
-  8 :rax add-reg64-imm8
-  :rax :rsi mov-reg64-reg64
-  pack-next
-  8 packalign
-  ;
+  ~   Having done that, nothing else needs to be defined in an unusual way, so
+  ~ we can go ahead and dispatch to the log-load transform, and do the rest of
+  ~ the code through that.
+
+  ~ (input string pointer, output buffer start, output point)
+  3roll
+  log-load-transform ;
 
-~   This is the mechanism to "return" from a word interpreted by docol.
-~ We pop the control stack, and then, since this is threaded execution, we
-~ do the next thing the caller wants to do, by inlining "next".
-~
-~   This word would work fine with the label transformation, so we could put
-~ it in core.e, but we choose to define it here because it's easier to
-~ understand when it's close to the rest of the execution stuff.
-~
-~ (previous entry address, output point
-~  -- new entry address, output point)
-: output-exit
-  s" exit" output-create
-  current-offset L!' exit
-  :rsi pack-popcontrol
-  pack-next
-  ;
+
+~ Where next?
+~ ~~~~~~~~~~~
+~
+~   The definitions of "docol" and "exit" are very tightly bound up with the
+~ execution model. They're defined and explained in core.e, because they need
+~ to be part of the build process in two different ways, like the other core
+~ functionality.
 
diff --git a/labels.e b/labels.e
index 8716364..253488e 100644
--- a/labels.e
+++ b/labels.e
@@ -288,7 +288,7 @@
   0 swap
   ~ TODO every time you double this to fix a crash, you must publicly
   ~ apologize for deferring a real fix. those are the rules
-  0x2000 allocate dup
+  0x4000 allocate dup
   ~ (iteration count, execution token, output start, output point)
   { 3 pick 100 > }
   { 2 pick execute 4 roll 1+ 4 unroll
diff --git a/log-load.e b/log-load.e
index 1339d7b..71324f2 100644
--- a/log-load.e
+++ b/log-load.e
@@ -151,10 +151,19 @@
 ~ TODO: remove one of them. Probably the other one.
 : log-offset                        0x10000 ; ~ 64 KiB
 
+~ (log address -- log address, "log" pointer)
+: log-load-log
+  dup log-offset + ;
+~ (log address -- log address, "s0" pointer)
+: log-load-s0
+  dup log-offset + 8 + ;
+~ (log address -- log address, "r0" pointer)
+: log-load-r0
+  dup log-offset + 2 8 * + ;
 ~ (log address -- log address, "latest" pointer)
 : log-load-latest
   dup log-offset + 3 8 * + ;
-~ (log address -- log address, "latest" pointer)
+~ (log address -- log address, "here" pointer)
 : log-load-here
   dup log-offset + 4 8 * + ;
 
@@ -236,3 +245,31 @@
   ~ (log address, updated here value, here)
   ! ;
 
+
+~   This is the same as "variable", from interpret.e, except that it takes the
+~ log's address as a parameter rather than hardcoding it, so that it can be
+~ used in situations where the normal compilation process isn't yet available.
+~
+~ (log address, address for new variable word, string pointer -- log address)
+: log-load-variable
+  3roll swap log-load-create
+  ~ (address for new variable word, log address)
+
+  log-load-here 3unroll
+  ~ (log address, address for new variable word, here)
+
+  dup @
+  ~ (log address, address for new variable word, here, output point)
+  dup 8 + pack64
+
+  3roll
+  :rax
+  mov-reg64-imm64
+  ~ (log address, here, output point)
+
+~   :rax push-reg64
+  pack-next
+  8 packalign
+
+  swap ! ;
+
diff --git a/transform.e b/transform.e
index 35ccc9b..7f3d9ef 100644
--- a/transform.e
+++ b/transform.e
@@ -96,6 +96,7 @@
 ~ their runtime addresses, though it is otherwise allowed to modify and rely
 ~ on them in all the usual ways. The alternate versions are defined in this
 ~ file as their own words, "Lcreate", "L:", "L;", and "L;asm".
+~ TODO note L@' and L!'
 ~
 ~   Note that these alternates are applied via a purely lexical
 ~ transformation: when a word would be looked up in the dictionary to
@@ -240,6 +241,7 @@
       { drop zero-input-buffer-metadata } if-else ;
 
 
+~ TODO rename this to transformation-state
 : transform-state-saved-here ;
 : transform-state-saved-latest 8 + ;
 : transform-state-output-buffer-start 2 8 * + ;
@@ -307,6 +309,15 @@ allocate-transform-state s" transform-state" variable
   target-address-space-to-offset offset-to-host-address-space ;
 
 
+: describe-transformation
+  ."    active here " here @ .hex64 space ." latest " latest @ .hex64 newline
+  ."     saved here " transform-state transform-state-saved-here
+  @ .hex64 space
+  ." latest " transform-state transform-state-saved-latest @ .hex64 newline
+  ."   output start " transform-state transform-state-output-buffer-start
+  @ .hex64 newline ;
+
+
 ~ Label transform implementation
 ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~
@@ -355,7 +366,7 @@ allocate-transform-state s" transform-state" variable
 
   ~ This looks up "docol" by label.
   swap-transform-variables
-  L@' docol
+  L@' docol-codeword-value
   L@' origin
   swap-transform-variables
   + ,
@@ -399,6 +410,47 @@ allocate-transform-state s" transform-state" variable
   ; make-immediate
 
 
+~   Because docol requires it, we provide a special mini-version of the label
+~ system. We only do L@' and L!', because that's all we need. These are real
+~ labels; there can be arbitrarily many of them, and they can have forward
+~ references.
+~
+~   The value that's accepted is in the host address space; the label is set
+~ to an offset; and the value that's returned is in the target address space.
+~
+~ (-- value)
+: label-L@'-alternate
+  word value@
+
+  swap-transform-variables
+  intern-label
+  use-label
+  swap-transform-variables
+
+  dropstring-with-result
+
+  offset-to-target-address-space
+  ; make-immediate
+
+
+~ (value --)
+: label-L!'-alternate
+  host-address-space-to-offset
+
+  word value@
+
+  swap-transform-variables
+  intern-label
+  swap-transform-variables
+
+  dropstring-with-result
+
+  swap-transform-variables
+  set-label
+  swap-transform-variables
+  ; make-immediate
+
+
 ~   This implements the label transform for a single word. It is directly
 ~ analogous to "interpret", and reading interpret.e may help in understanding
 ~ it, though it's meant to still make sense on its own.
@@ -428,6 +480,8 @@ allocate-transform-state s" transform-state" variable
   dup s" :" stringcmp 0 = { swap drop ' L: swap } if
   dup s" ;" stringcmp 0 = { swap drop ' L; swap } if
   dup s" ;asm" stringcmp 0 = { swap drop ' L;asm swap } if
+  dup s" L@'" stringcmp 0 = { swap drop ' label-L@'-alternate swap } if
+  dup s" L!'" stringcmp 0 = { swap drop ' label-L!'-alternate swap } if
   drop swap
   ~ (name as stack string, 0 or alternate entry pointer, name pointer)
 
@@ -495,25 +549,61 @@ allocate-transform-state s" transform-state" variable
         ~ and clean up.
         offset-to-target-address-space , drop dropstring 0 exit
       } if
-    } if
-  } if
-  ~ (name as stack string, immediate entry pointer, name pointer)
 
-  ~   If we got here, one of three things is true: We're in interpret mode;
-  ~ the word is immediate; or no word was found. If the immediate entry
-  ~ pointer is non-zero, run it.
-  over {
-    drop dropstring-with-result entry-to-execution-token execute
-    0 exit
-  } if
+      ~   If we got here, we're in compile mode, no label was found, and even
+      ~ if there was a candidate for an immediate word it wasn't flagged as
+      ~ immediate. There are two possibilities: It's genuinely missing, or it's
+      ~ an integer literal. We decline to run the candidate immediate entry,
+      ~ even if it exists, because that's not the correct semantics.
+      ~
+      ~   If the word is genuinely missing, we want to make sure we make it
+      ~ all the way to the not-found error-handling code at the end, because
+      ~ that will be way easier to debug than doing the wrong thing will. Way,
+      ~ way easier. Far less staring at numbers.
+      ~
+      ~   Anyway, we no longer need the immediate entry pointer, so we drop
+      ~ it.
+      drop drop
+    } {
+      ~   If we get here, we're in compile mode, but there was a candidate
+      ~ entry for an immediate word, and it was indeed flagged as immediate.
+      ~ So, we run it and exit.
+      drop dropstring-with-result entry-to-execution-token execute
+      0 exit
+    } if-else
+
+    ~   This is the end of the compile-mode branch. As you can see by tracing
+    ~ through all the above cases, if we got here, the two possibilities are
+    ~ that the word is genuinely missing, or it's an integer literal.
+    ~
+    ~   Please notice that these are the same two possibilities remaining at
+    ~ the end of the immediate-mode branch, below.
+  } {
+    ~   If we got here, we're in interpret mode. There are three
+    ~ possibilities: there's an immediate word which we should run; it's an
+    ~ integer literal; or the word is genuinely missing.
+    ~
+    ~   If the immediate entry pointer is non-zero, run it and exit.
+    over {
+      drop dropstring-with-result entry-to-execution-token execute
+      0 exit
+    } if
 
-  ~   If we're still here, it wasn't in the dictionary. Also, we don't need
-  ~ the immediate entry pointer, either.
-  drop drop
+    ~   There was no immediate word, so either it's an integer literal or
+    ~ the word is genuinely missing. Please notice that these are the same two
+    ~ possibilities remaining at the end of the compile-mode branch, above.
+    ~
+    ~   We no longer need the immediate entry pointer, so drop it.
+    drop drop
+  } if-else
   ~ (name as stack string)
 
-  ~   If it's not in the dictionary, check whether it's an integer literal. As
-  ~ before, we get the stack address and use it as a string pointer.
+  ~   If we got here, one of two things is true: the word is an integer
+  ~ literal, or it's genuinely missing. We know this because both the mode
+  ~ cases above end with these as the only two remaining possibilities.
+  ~
+  ~   Check whether it's an integer literal. As before, we get the stack
+  ~ address and use it as a string pointer.
   value@ read-integer 0 = {
     ~ It's a number.
     interpreter-flags @ 0x01 & {
@@ -536,8 +626,13 @@ allocate-transform-state s" transform-state" variable
     0 exit
   } if
 
-  ~ If it's neither in the dictionary nor a number, just print an error.
-  s" No such word: " emitstring value@ emitstring dropstring 0 ;
+  ~   If it's neither in the dictionary nor a number, just print an error.
+  ~
+  ~   It's really important, when maintaining this code, to make sure that all
+  ~ the possible ways the word can fail to exist end up here. Doing anything
+  ~ else is going to result in many hours of trying to untangle the
+  ~ consequences of incorrect behavior after the fact.
+  s" No such word: " emitstring value@ emitstring newline dropstring 0 ;
 
 
 ~   This implements the label transform for all words in a region given as an
@@ -623,17 +718,31 @@ allocate-transform-state s" transform-state" variable
 ~ to be extremely useful to read and understand ":" in interpret.e before
 ~ attempting to understand "log-load-colon-alternate".
 : log-load-colon-alternate
-  ~ ~ This calls "log-load-create" instead of "create".
+  ~ This calls "log-load-create" instead of "create".
   word value@ log-load-create-alternate dropstring
 
-  ~ This looks up "docol" by label.
-  ~ swap-transform-variables
-  ~ L@' docol
-  ~ L@' origin
-  ~ swap-transform-variables
-  ~ + ,
+  ~   We generate code that looks up "docol" by name, runs it to get the
+  ~ codeword pointer, then finally appends it to the entry.
+  swap-transform-variables
+  ~ As usual, we do these in reverse.
+  L@' log-load-comma
+  L@' execute
+  L@' log-load-find-execution-token
+  L@' litstring
+  swap-transform-variables
 
-  ~ TODO note no hiding the entry
+  offset-to-target-address-space ,     ~ litstring
+  here @ s" docol" packstring 8 packalign here !
+  offset-to-target-address-space ,     ~ log-load-find-execution-token
+  offset-to-target-address-space ,     ~ execute
+  offset-to-target-address-space ,     ~ log-load-comma
+
+  ~   This is where we would mark the entry hidden, but we don't do that. It
+  ~ won't shadow anything and it won't be called until the entire log-load
+  ~ routine has finished.
+
+  ~   Switching between immediate and compile mode is one of the very few
+  ~ things that happens NOW, while the log-load transform is actually running.
   ]
   ;
 
@@ -643,16 +752,24 @@ allocate-transform-state s" transform-state" variable
 ~ likely to be extremely useful to read and understand ";" in interpret.e
 ~ before attempting to understand "log-load-semicolon-alternate".
 : log-load-semicolon-alternate
-  ~ ~ This looks up "exit" by label.
-  ~ swap-transform-variables
-  ~ L@' exit
-  ~ swap-transform-variables
-  ~ offset-to-target-address-space ,
+  ~   We generate code that looks up "exit" by name and appends it to the
+  ~ entry.
+  swap-transform-variables
+  ~ As usual, we do these in reverse.
+  L@' log-load-comma
+  L@' log-load-find-execution-token
+  L@' litstring
+  swap-transform-variables
 
-  ~ latest @ unhide-entry
+  offset-to-target-address-space ,     ~ litstring
+  here @ s" exit" packstring 8 packalign here !
+  offset-to-target-address-space ,     ~ log-load-find-execution-token
+  offset-to-target-address-space ,     ~ log-load-comma
 
-  ~ ~   Since [ is an immediate word, we have to go to extra trouble to compile
-  ~ ~ it as part of ;.
+  ~ This is where we would unhide the entry, but again, we don't do that.
+
+  ~   Since [ is an immediate word, we have to go to extra trouble to compile
+  ~ it as part of ;.
   [ ' [ entry-to-execution-token , ]
   ; make-immediate
 
@@ -673,6 +790,60 @@ allocate-transform-state s" transform-state" variable
   ~ [ ' [ entry-to-execution-token , ]
   ; make-immediate
 
+
+~   Because docol requires it, we provide a special mini-version of the label
+~ system. We only do L@' and L!', because that's all we need. Unlike the
+~ version of this feature for the label transform, for the log-load transform,
+~ we heavily restrict the use-case.
+~
+~   The implementation strategy is that we ignore the label name, and store
+~ the value on the stack when the generated log-load routine runs. So, each
+~ instance of L@' must be closely followed by a matching instance of L!'. Each
+~ label can only ever be used exactly once, and it must be a backward
+~ reference. Furthermore, there is a very tight restriction on what can be
+~ on the stack. The easiest way to explain it is by showing the interface of
+~ these words from the transformed code's perspective:
+~
+~   L!' is (preserved value, value of label
+~           -- value of label, preserved value)
+~   L@' is (value of label, preserved value
+~           -- preserved value, value of label)
+~
+~   The preserved value is simply another item on the stack, which the label
+~ takes pains not to interfere with.
+~
+~   There is no adjustment done on the saved value, since it's created in the
+~ target address space and then also used in the target address space. It
+~ wouldn't actually be necessary to use this at all, since checking "here"
+~ would be sufficient, but then the code would have to do something different
+~ depending on which transform it's running under, and there'd have to be a
+~ mechanism for that.
+~
+~   If that sounds super complex: All we actually do is read a label name,
+~ ignore it, and output a call to swap.
+~
+~   This is sufficient to implement docol, and that's probably the only thing
+~ it should be used for.
+: log-load-L@'-alternate
+  word dropstring
+
+  swap-transform-variables
+  L@' swap
+  swap-transform-variables
+
+  offset-to-target-address-space ,     ~ swap
+  ; make-immediate
+
+: log-load-L!'-alternate
+  word dropstring
+
+  swap-transform-variables
+  L@' swap
+  swap-transform-variables
+
+  offset-to-target-address-space ,     ~ swap
+  ; make-immediate
+
 ~   This implements the log-load transform for a single word. It is directly
 ~ analogous to "interpret", and reading interpret.e may help in understanding
 ~ it, though it's meant to still make sense on its own.
@@ -706,6 +877,8 @@ allocate-transform-state s" transform-state" variable
     swap drop ' log-load-semicolon-alternate swap } if
   dup s" ;asm" stringcmp 0 = {
     swap drop ' log-load-semicolon-assembly-alternate swap } if
+  dup s" L@'" stringcmp 0 = { swap drop ' log-load-L@'-alternate swap } if
+  dup s" L!'" stringcmp 0 = { swap drop ' log-load-L!'-alternate swap } if
   drop
   ~ (name as stack string, 0 or alternate entry pointer)
 
@@ -787,10 +960,46 @@ allocate-transform-state s" transform-state" variable
   ~ (name as stack string)
 
   ~   We're in immediate mode. We compile code that runs the word immediately.
-  ~ We check whether there's a label for the word; if there is, we output
-  ~ that. Otherwise we output code that looks it up and runs it.
-  ~ TODO
+  ~ We check whether there's a label for the word; if there is, we'll output
+  ~ that. Otherwise we'll output code that looks it up in the log and runs it.
+  ~
+  ~   Just like in label-transform, we use find-label to check whether a label
+  ~ exists without declaring a dependency on it, then if it does, we do
+  ~ use-label to ask for its value.
+  ~
+  ~   There's one additional wrinkle to remember here: We're running inside
+  ~ the label loop, and warm-start appears before all the normal words in the
+  ~ executable. So all the labels we'll be checking are forward references,
+  ~ and on the very first pass they definitely won't be defined. That's fine
+  ~ though, they will exist on all subsequent passes, so things will
+  ~ definitely still converge.
+  ~
+  ~   The first pass will never accidentally think it succeeded, because even
+  ~ the reference to L' cold-start from the ELF header is a forward reference
+  ~ and won't exist on the first pass.
+  value@
+  swap-transform-variables
+  find-label
+  swap-transform-variables
+  {
+    ~   Again just like in label-transform, we declare our use of the label
+    ~ and get a value for it.
+    value@
+    swap-transform-variables
+    intern-label use-label
+    swap-transform-variables
+
+    ~   Like in label-transform, this is a codeword pointer, so we just output
+    ~ it directly. Also as before, because we don't have to examine it, we
+    ~ don't have to do anything special in the case where it's zero due to the
+    ~ way the label loop works.
+    offset-to-target-address-space , dropstring 0 exit
+    dropstring 0 exit
+  } if
 
+  ~   There's no label for the word; that means it wasn't statically
+  ~ compiled-in to the target executable. So we output code that looks up the
+  ~ word by name on the log, then calls it.
   value@
   swap-transform-variables
   ~ This is reverse order again.