From eec336dea3d86e176c4bd86c435e6be35fec64e2 Mon Sep 17 00:00:00 2001
From: Irene Knapp
Date: Mon, 18 May 2026 16:36:08 -0700
Subject: okay there's a strat for making the here/latest variables now

it doesn't work yet, but it's gonna

the reason it doesn't work is that the new helper log-load-variable relies
on the assembly-definition words being statically available, and they
aren't yet

that's fine though, this is still a huge change, worth checking in. why?
well, it represents like 16 hours of debugging which culminated in some
very minor changes to the semantics of the label transform, in order to
make missing words easier to notice and debug. see comments for details.

woooooo :D

Force-Push: yes
Change-Id: Id8334819d165ba9e3156ef2bf32008af748eac29
---
 core.e | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 2 deletions(-)

(limited to 'core.e')

diff --git a/core.e b/core.e
index b2062d1..0390812 100644
--- a/core.e
+++ b/core.e
@@ -8,8 +8,88 @@
 ~ of it is later copied into the log when that executable runs. Therefore, it
 ~ is written to obey the constraints of both the label transform, and the
 ~ log-load transform; see transform.e for more details on that.
+
+
+~ Execution support
+~ ~~~~~~~~~~~~~~~~~
+~
+~ There's two words, docol and exit, which are essential parts of the
+~ indirect-threaded execution model. It would be tempting to put them in
+~ execution.e, so they'd be closer to the explanation of what they do, but
+~ we need two copies of them (just like we do of every other word in this
+~ file), one statically compiled and one in the log. So, they're here, because
+~ that's significantly simpler, even though it creates a little bit of extra
+~ work for the label transform.
+
+
+~ Docol is the "interpreter" that is responsible for the semantics of words
+~ written as Forth high-level code. The name is said to be short for
+~ "do colon", because word definitions begin with a colon.
+~
+~ Concretely, when interpreting, it saves rsi (the "instruction pointer") to
+~ the control stack, takes the address of the codeword from rax and increments
+~ it in-place to form the new instruction pointer, and copies that to rsi.
+~
+~ Having then done this, we're now in the state that normal execution
+~ expects, so docol ends by using "next" to begin the callee's execution,
+~ kicking off a nested call.
 ~
+~ At runtime, invoke docol directly and it will return the value that should
+~ be used as a codeword. When compiling to a binary executable, the
+~ transformation facility needs to reference it directly, and for that
+~ purpose, the label "docol-codeword-value" points to the correct place.
+~
+~ Registers in:
+~
+~ * rsi is the caller's instruction pointer
+~ * rbp is the control stack pointer
+~ * rax is the address of the callee's codeword
+~
+~ Registers out:
+~
+~ * rsi is the callee's instruction pointer
+~ * rbp is the control stack pointer
+
+~ We need this one snippet of assembly code that's just on the log raw,
+~ without a word header, because of its role in making words work. We define
+~ a label pointing to it.
 ~
+~ This is the only use of labels in core.e. There's no easy way to
+~ remove it. We can't use immediate computation based on "here" because
+~ the label transform uses the host address space, not the target address
+~ space. The transformation facility's support for labels in transformed
+~ code was added just for this.
+~
+~ It may seem as if we could have used some extra space in the middle of
+~ the proper docol word, which is defined just below. That would work fine
+~ with the label transform, which can do forward references, but the log-load
+~ transform's label support is special-cased to ONLY do this, and it will only
+~ work with a backward reference.
+here @
+dup L!' docol-codeword-value
+:rsi pack-pushcontrol
+8 :rax add-reg64-imm8
+:rax :rsi mov-reg64-reg64
+pack-next
+8 packalign
+here !
+
+: docol
+  [ here @
+    ~ Evaluated as a word, docol is a constant which returns a pointer.
+    L@' docol-codeword-value :rax mov-reg64-imm64
+    :rax push-reg64
+    here ! ] ;asm
+
+~ This is the mechanism to "return" from a word interpreted by docol.
+~ We pop the control stack, and then, since this is threaded execution, we
+~ do the next thing the caller wants to do, by inlining "next".
+: exit
+  [ here @
+    :rsi pack-popcontrol
+    here ! ] ;asm
+
+
 ~ Stack manipulation routines
 ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~
@@ -1063,8 +1143,12 @@

 ~ (output point, alignment byte count -- output point)
 : packalign
-  { 2dup /% drop { drop exit } unless
-    swap 0 pack8 swap } forever ;
+  2dup /% drop 0branch [ 8 8 * , ]
+  swap 0 pack8 swap branch [ -11 8 * , ]
+  drop ;
+  ~ TODO this is the implementation we could use if we had more flow-control
+  ~ { 2dup /% drop { drop exit } unless
+  ~   swap 0 pack8 swap } forever ;


 ~ Binary unpacking
--
cgit 1.4.1