| -rw-r--r-- | core.e | 88 |
| -rw-r--r-- | evoke.e | 26 |
| -rw-r--r-- | execution.e | 143 |
| -rw-r--r-- | labels.e | 2 |
| -rw-r--r-- | log-load.e | 39 |
| -rw-r--r-- | transform.e | 283 |
6 files changed, 441 insertions, 140 deletions
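
As a reading aid for the docol and exit comments in this diff, here is a minimal Python sketch of the indirect-threaded model they describe. It is not the project's code. The VM class, the list-of-cells memory, and the lit, plus and halt helpers are all assumptions invented for illustration, and one list cell stands in for the 8 bytes that the real docol adds to rax. Roughly, ip plays the role of rsi, w plays the role of rax, and control is the control stack that rbp points at.

```python
class VM:
    """Toy indirect-threaded machine; every name here is hypothetical."""

    def __init__(self):
        self.mem = []       # each cell holds a callable ("machine code") or an int
        self.data = []      # data stack
        self.control = []   # control stack (reached through rbp in the diff)
        self.ip = 0         # instruction pointer (rsi in the diff)
        self.w = 0          # address of the current codeword (rax in the diff)
        self.running = False
        self.exit_addr = self.primitive(exit_)
        self.halt_addr = self.primitive(halt)

    def primitive(self, code):
        """Lay down a word whose codeword cell is its own machine code."""
        self.mem.append(code)
        return len(self.mem) - 1

    def colon(self, *body):
        """Lay down a word: docol as codeword, then the body, then exit."""
        addr = len(self.mem)
        self.mem.append(docol)
        self.mem.extend(body)
        self.mem.append(self.exit_addr)
        return addr

    def run(self, word):
        """Execute one word via a two-cell bootstrap thread: word, halt."""
        boot = len(self.mem)
        self.mem.extend([word, self.halt_addr])
        self.ip = boot
        self.running = True
        while self.running:
            # The real primitives end by inlining "next"; this loop stands
            # in for that by calling next_ after each primitive returns.
            next_(self)
        return self.data


def next_(vm):
    """Fetch the next codeword address, advance ip, jump through the codeword."""
    vm.w = vm.mem[vm.ip]
    vm.ip += 1
    vm.mem[vm.w](vm)


def docol(vm):
    """Save the caller's ip, then point ip just past the callee's codeword."""
    vm.control.append(vm.ip)
    vm.ip = vm.w + 1


def exit_(vm):
    """Return from a docol word by restoring the caller's ip."""
    vm.ip = vm.control.pop()


def halt(vm):
    vm.running = False


def lit(vm):
    """Push the cell that follows in the thread, then skip over it."""
    vm.data.append(vm.mem[vm.ip])
    vm.ip += 1


def plus(vm):
    b = vm.data.pop()
    a = vm.data.pop()
    vm.data.append(a + b)


def run_example():
    vm = VM()
    lit_addr = vm.primitive(lit)
    plus_addr = vm.primitive(plus)
    # Mirrors ": foo 1 2 + ;" from the evoke.e hunk.
    foo = vm.colon(lit_addr, 1, lit_addr, 2, plus_addr)
    # A second colon word that calls foo twice, to exercise nested calls.
    twice = vm.colon(foo, foo, plus_addr)
    return vm.run(twice)


if __name__ == "__main__":
    print(run_example())
```

The nested call is the point of the sketch. Entering twice and then foo pushes two return addresses onto the control stack, and each exit pops one, which is the behaviour the pushcontrol and popcontrol pair provides in the real definitions.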
diff --git a/core.e b/core.e index b2062d1..0390812 100644 --- a/core.e +++ b/core.e @@ -8,8 +8,88 @@ ~ of it is later copied into the log when that executable runs. Therefore, it ~ is written to obey the constraints of both the label transform, and the ~ log-load transform; see transform.e for more details on that. + + +~ Execution support +~ ~~~~~~~~~~~~~~~~~ +~ +~ There's two words, docol and exit, which are essential parts of the +~ indirect-threaded execution model. It would be tempting to put them in +~ execution.e, so they'd be closer to the explanation of what they do, but +~ we need two copies of them (just like we do of every other word in this +~ file), one statically compiled and one in the log. So, they're here, because +~ that's significantly simpler, even though it creates a little bit of extra +~ work for the label transform. + + +~ Docol is the "interpreter" that is responsible for the semantics of words +~ written as Forth high-level code. The name is said to be short for +~ "do colon", because word definitions begin with a colon. +~ +~ Concretely, when interpreting, it saves rsi (the "instruction pointer") to +~ the control stack, takes the address of the codeword from rax and increments +~ it in-place to form the new instruction pointer, and copies that to rsi. +~ +~ Having then done this, we're now in the state that normal execution +~ expects, so docol ends by it using "next" to begin the callee's execution, +~ kicking off a nested call. ~ +~ At runtime, invoke docol directly and it will return the value that should +~ be used as a codeword. When compiling to a binary executable, the +~ transformation facility needs to reference it directly, and for that +~ purpose, the label "docol-codeword-value" points to the correct place. 
+~ +~ Registers in: +~ +~ * rsi is the caller's instruction pointer +~ * rbp is the control stack pointer +~ * rax is the address of the callee's codeword +~ +~ Registers out: +~ +~ * rsi is the callee's instruction pointer +~ * rbp is the control stack pointer + +~ We need this one snippet of assembly code that's just on the log raw, +~ without a word header, because of its role in making words work. We define +~ a label pointing to it. ~ +~ This is the only use of labels in core.e. There's no easy way to +~ remove it. We can't use immediate computation based on "here" because +~ the label transform uses the host address space, not the target address +~ space. The transformation facility's support for labels in transformed +~ code was added just for this. +~ +~ It may seem as if we could have used some extra space in the middle of +~ the proper docol word, which is defined just below. That would work fine +~ with the label transform, which can do forward references, but the log-load +~ transform's label support is special-cased to ONLY do this, and it will only +~ work with a backward reference. +here @ +dup L!' docol-codeword-value +:rsi pack-pushcontrol +8 :rax add-reg64-imm8 +:rax :rsi mov-reg64-reg64 +pack-next +8 packalign +here ! + +: docol + [ here @ + ~ Evaluated as a word, docol is a constant which returns a pointer. + L@' docol-codeword-value :rax mov-reg64-imm64 + :rax push-reg64 + here ! ] ;asm + +~ This is the mechanism to "return" from a word interpreted by docol. +~ We pop the control stack, and then, since this is threaded execution, we +~ do the next thing the caller wants to do, by inlining "next". +: exit + [ here @ + :rsi pack-popcontrol + here ! 
] ;asm + + ~ Stack manipulation routines ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~ @@ -1063,8 +1143,12 @@ ~ (output point, alignment byte count -- output point) : packalign - { 2dup /% drop { drop exit } unless - swap 0 pack8 swap } forever ; + 2dup /% drop 0branch [ 8 8 * , ] + swap 0 pack8 swap branch [ -11 8 * , ] + drop ; + ~ TODO this is the implementation we could use if we had more flow-control + ~ { 2dup /% drop { drop exit } unless + ~ swap 0 pack8 swap } forever ; ~ Binary unpacking diff --git a/evoke.e b/evoke.e index 607242d..869de31 100644 --- a/evoke.e +++ b/evoke.e @@ -7,8 +7,30 @@ s" source-to-precompile" variable +~ : fooze 4 . ; fooze 1024 read-to-buffer -: fooze 4 . ; +here @ +dup L!' docol-codeword-value +:rsi pack-pushcontrol +8 :rax add-reg64-imm8 +:rax :rsi mov-reg64-reg64 +pack-next +8 packalign +here ! + +: docol + [ here @ + L@' docol-codeword-value :rax mov-reg64-imm64 + :rax push-reg64 + here ! ] ;asm + +: exit + [ here @ + :rsi pack-popcontrol + here ! ] ;asm + +: foo 1 2 + ; +0 sys-exit pyrzqxgl s" source-to-copy-to-log" variable @@ -26,8 +48,6 @@ s" source-to-copy-to-log" variable elf-program-header output-cold-start source-to-copy-to-log output-warm-start - output-docol - output-exit source-to-precompile label-transform 0 L!' final-word-name current-offset L!' total-size diff --git a/execution.e b/execution.e index daacddb..1b9e84d 100644 --- a/execution.e +++ b/execution.e @@ -484,16 +484,11 @@ 3unroll 8 packalign current-offset L!' warm-start - 3roll - - log-load-transform - - ~ TODO this is tied to the specific example in evoke - ~ L@' happy-path L@' origin + pack64 + ~ (input string pointer, output buffer start, output point) ~ Before handing off to us, cold-start pushed a single value onto the - ~ stack, a pointer to the beginning of the heap. Now, we load our entire - ~ Forth implementation onto that heap, beginning with the minimal set of + ~ stack, a pointer to the beginning of the log. 
Now, we load our entire + ~ Forth implementation onto that log, beginning with the minimal set of ~ words needed to define more words. We do this because we need variables as ~ infrastructure so we can eventually have dynamic definitions. ~ @@ -514,117 +509,73 @@ ~ That choice does mean we have the hard version of this bootstrapping ~ problem, and copying ourselves to the heap is how we solve it. ~ - ~ We do have the heap address right now, though that won't last. In case + ~ We do have the log address right now, though that won't last. In case ~ it's unclear why not: keeping it on the stack would require all future ~ references to walk the stack, and somehow know when they've reached the ~ bottom. The stack is a good place to keep things with clearly delimited ~ lifetimes and visibility, but when we want something to live for our ~ entire program and be easy to find from any code within it, we need to ~ do something else. Anyway, since we have the address, we can use it for - ~ the next little bit of setup. + ~ the next little bit of setup. We have a bunch of helper words, from + ~ log-load.e, which make this easier. ~ ~ The first few words we define are our variables, which hardcode the ~ addresses they will return - but since we're doing this at runtime, - ~ "hardcoding" can reflect where our heap is. This is the fundamental - ~ trick that makes the heap usable. + ~ "hardcoding" can reflect where our log is. This is the fundamental + ~ trick that makes the log usable. ~ ~ One more thing to notice: We already allocated the backing stores of - ~ these variables, and populated their initial values, in _start. The + ~ these variables, and populated their initial values, in cold-start. The ~ words we're defining return those same addresses for the same backing ~ stores. So, we have continuity: Stuff defined in terms of the ~ variable-words we're defining now will interoperate with the stuff that - ~ we define in the "early" way, which includes those very words. 
Both the - ~ early code and the later code are dealing with the same data structures, - ~ they're just using a different technique to find them. + ~ we define using the log-load helpers, which includes those very words. + ~ Both the log-load code and the later code are dealing with the same data + ~ structures, they're just using a different technique to find them. ~ ~ This is the only hardcoding we need to do; by building on top of it, ~ we will soon reach a point where the rest of the system can be defined ~ within itself. - ~ TODO These need to, like, exist first. Also they need to be referenced - ~ as labels. - ~ dq early_heap, litstring, "heap", early_variable - ~ dq early_s0, litstring, "s0", early_variable - ~ dq early_r0, litstring, "r0", early_variable - ~ dq early_latest, litstring, "latest", early_variable - ~ dq early_here, litstring, "here", early_variable - ; + L@' log-load-log offset-to-target-address-space pack64 + L@' litstring offset-to-target-address-space pack64 + s" log" packstring 8 packalign + L@' log-load-variable offset-to-target-address-space pack64 -~ (previous entry address, output point, name string pointer -~ -- new entry address, output point) -: output-create - 3roll dup 4 roll swap pack64 - ~ (string pointer, new entry address, output point) - 0 pack8 - 0 pack8 - roll3 packstring - ~ (new entry address, output point) - 8 packalign - ; + L@' log-load-s0 offset-to-target-address-space pack64 + L@' litstring offset-to-target-address-space pack64 + s" s0" packstring 8 packalign + L@' log-load-variable offset-to-target-address-space pack64 + L@' log-load-r0 offset-to-target-address-space pack64 + L@' litstring offset-to-target-address-space pack64 + s" r0" packstring 8 packalign + L@' log-load-variable offset-to-target-address-space pack64 -~ Routine docol -~ ~~~~~~~~~~~~~ -~ -~ Reference this via its label as the codeword of a word to make it an -~ "interpreted" word. 
Concretely, it saves rsi (the "instruction pointer") -~ to the control stack, takes the address of the codeword from rax and -~ increments it in-place to form the new instruction pointer, and copies -~ that to rsi. -~ -~ Having then done this, we're now in the state that normal execution -~ expects, so docol ends by it using "next" to begin the callee's execution, -~ kicking off a nested call. -~ -~ The name is said to be short for "do colon", because Forth high-level -~ code begins word definitions with a colon. -~ -~ Registers in: -~ -~ * rsi is the caller's instruction pointer -~ * rbp is the control stack pointer -~ * rax is the address of the callee's codeword -~ -~ Registers out: -~ -~ * rsi is the callee's instruction pointer -~ * rbp is the control stack pointer -~ -~ (previous entry address, output point) -: output-docol - s" docol" output-create + L@' log-load-latest offset-to-target-address-space pack64 + L@' litstring offset-to-target-address-space pack64 + s" latest" packstring 8 packalign + L@' log-load-variable offset-to-target-address-space pack64 - ~ Evaluated as a word, docol is a constant which returns a pointer. - L@' docol :rax mov-reg64-imm64 - :rax push-reg64 - pack-next - 8 packalign + L@' log-load-here offset-to-target-address-space pack64 + L@' litstring offset-to-target-address-space pack64 + s" here" packstring 8 packalign + L@' log-load-variable offset-to-target-address-space pack64 - ~ Since docol is not a normal word, the label points to the value we care - ~ about from the assembly side of things, which is the address we use as the - ~ codeword. - current-offset L!' docol - :rsi pack-pushcontrol - 8 :rax add-reg64-imm8 - :rax :rsi mov-reg64-reg64 - pack-next - 8 packalign - ; + ~ Having done that, nothing else needs to be defined in an unusual way, so + ~ we can go ahead and dispatch to the log-load transform, and do the rest of + ~ the code through that. 
+ + ~ (input string pointer, output buffer start, output point) + 3roll + log-load-transform ; -~ This is the mechanism to "return" from a word interpreted by docol. -~ We pop the control stack, and then, since this is threaded execution, we -~ do the next thing the caller wants to do, by inlining "next". -~ -~ This word would work fine with the label transformation, so we could put -~ it in core.e, but we choose to define it here because it's easier to -~ understand when it's close to the rest of the execution stuff. -~ -~ (previous entry address, output point -~ -- new entry address, output point) -: output-exit - s" exit" output-create - current-offset L!' exit - :rsi pack-popcontrol - pack-next - ; + +~ Where next? +~ ~~~~~~~~~~~ +~ +~ The definitions of "docol" and "exit" are very tightly bound up with the +~ execution model. They're defined and explained in core.e, because they need +~ to be part of the build process in two different ways, like the other core +~ functionality. diff --git a/labels.e b/labels.e index 8716364..253488e 100644 --- a/labels.e +++ b/labels.e @@ -288,7 +288,7 @@ 0 swap ~ TODO every time you double this to fix a crash, you must publicly ~ apologize for deferring a real fix. those are the rules - 0x2000 allocate dup + 0x4000 allocate dup ~ (iteration count, execution token, output start, output point) { 3 pick 100 > } { 2 pick execute 4 roll 1+ 4 unroll diff --git a/log-load.e b/log-load.e index 1339d7b..71324f2 100644 --- a/log-load.e +++ b/log-load.e @@ -151,10 +151,19 @@ ~ TODO: remove one of them. Probably the other one. 
: log-offset 0x10000 ; ~ 64 KiB +~ (log address -- log address, "log" pointer) +: log-load-log + dup log-offset + ; +~ (log address -- log address, "s0" pointer) +: log-load-s0 + dup log-offset + 8 + ; +~ (log address -- log address, "r0" pointer) +: log-load-r0 + dup log-offset + 2 8 * + ; ~ (log address -- log address, "latest" pointer) : log-load-latest dup log-offset + 3 8 * + ; -~ (log address -- log address, "latest" pointer) +~ (log address -- log address, "here" pointer) : log-load-here dup log-offset + 4 8 * + ; @@ -236,3 +245,31 @@ ~ (log address, updated here value, here) ! ; + +~ This is the same as "variable", from interpret.e, except that it takes the +~ log's address as a parameter rather than hardcoding it, so that it can be +~ used in situations where the normal compilation process isn't yet available. +~ +~ (log address, address for new variable word, string pointer -- log address) +: log-load-variable + 3roll swap log-load-create + ~ (address for new variable word, log address) + + log-load-here 3unroll + ~ (log address, address for new variable word, here) + + dup @ + ~ (log address, address for new variable word, here, output point) + dup 8 + pack64 + + 3roll + :rax + mov-reg64-imm64 + ~ (log address, here, output point) + +~ :rax push-reg64 + pack-next + 8 packalign + + swap ! ; + diff --git a/transform.e b/transform.e index 35ccc9b..7f3d9ef 100644 --- a/transform.e +++ b/transform.e @@ -96,6 +96,7 @@ ~ their runtime addresses, though it is otherwise allowed to modify and rely ~ on them in all the usual ways. The alternate versions are defined in this ~ file as their own words, "Lcreate", "L:", "L;", and "L;asm". +~ TODO note L@' and L!' 
~ ~ Note that these alternates are applied via a purely lexical ~ transformation: when a word would be looked up in the dictionary to @@ -240,6 +241,7 @@ { drop zero-input-buffer-metadata } if-else ; +~ TODO rename this to transformation-state : transform-state-saved-here ; : transform-state-saved-latest 8 + ; : transform-state-output-buffer-start 2 8 * + ; @@ -307,6 +309,15 @@ allocate-transform-state s" transform-state" variable target-address-space-to-offset offset-to-host-address-space ; +: describe-transformation + ." active here " here @ .hex64 space ." latest " latest @ .hex64 newline + ." saved here " transform-state transform-state-saved-here + @ .hex64 space + ." latest " transform-state transform-state-saved-latest @ .hex64 newline + ." output start " transform-state transform-state-output-buffer-start + @ .hex64 newline ; + + ~ Label transform implementation ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~ @@ -355,7 +366,7 @@ allocate-transform-state s" transform-state" variable ~ This looks up "docol" by label. swap-transform-variables - L@' docol + L@' docol-codeword-value L@' origin swap-transform-variables + , @@ -399,6 +410,47 @@ allocate-transform-state s" transform-state" variable ; make-immediate +~ Because docol requires it, we provide a special mini-version of the label +~ system. We only do L@' and L!', because that's all we need. These are real +~ labels; there can be arbitrarily many of them, and they can have forward +~ references. +~ +~ The value that's accepted is in the host address space; the label is set +~ to an offset; and the value that's returned is in the target address space. 
+~ +~ (-- value) +: label-L@'-alternate + word value@ + + swap-transform-variables + intern-label + use-label + swap-transform-variables + + dropstring-with-result + + offset-to-target-address-space + ; make-immediate + + +~ (value --) +: label-L!'-alternate + host-address-space-to-offset + + word value@ + + swap-transform-variables + intern-label + swap-transform-variables + + dropstring-with-result + + swap-transform-variables + set-label + swap-transform-variables + ; make-immediate + + ~ This implements the label transform for a single word. It is directly ~ analogous to "interpret", and reading interpret.e may help in understanding ~ it, though it's meant to still make sense on its own. @@ -428,6 +480,8 @@ allocate-transform-state s" transform-state" variable dup s" :" stringcmp 0 = { swap drop ' L: swap } if dup s" ;" stringcmp 0 = { swap drop ' L; swap } if dup s" ;asm" stringcmp 0 = { swap drop ' L;asm swap } if + dup s" L@'" stringcmp 0 = { swap drop ' label-L@'-alternate swap } if + dup s" L!'" stringcmp 0 = { swap drop ' label-L!'-alternate swap } if drop swap ~ (name as stack string, 0 or alternate entry pointer, name pointer) @@ -495,25 +549,61 @@ allocate-transform-state s" transform-state" variable ~ and clean up. offset-to-target-address-space , drop dropstring 0 exit } if - } if - } if - ~ (name as stack string, immediate entry pointer, name pointer) - ~ If we got here, one of three things is true: We're in interpret mode; - ~ the word is immediate; or no word was found. If the immediate entry - ~ pointer is non-zero, run it. - over { - drop dropstring-with-result entry-to-execution-token execute - 0 exit - } if + ~ If we got here, we're in compile mode, no label was found, and even + ~ if there was a candidate for an immediate word it wasn't flagged as + ~ immediate. There are two possibilities: It's genuinely missing, or it's + ~ an integer literal. 
We decline to run the candidate immediate entry, + ~ even if it exists, because that's not the correct semantics. + ~ + ~ If the word is genuinely missing, we want to make sure we make it + ~ all the way to the not-found error-handling code at the end, because + ~ that will be way easier to debug than doing the wrong thing will. Way, + ~ way easier. Far less staring at numbers. + ~ + ~ Anyway, we no longer need the immediate entry pointer, so we drop + ~ it. + drop drop + } { + ~ If we get here, we're in compile mode, but there was a candidate + ~ entry for an immediate word, and it was indeed flagged as immediate. + ~ So, we run it and exit. + drop dropstring-with-result entry-to-execution-token execute + 0 exit + } if-else + + ~ This is the end of the compile-mode branch. As you can see by tracing + ~ through all the above cases, if we got here, the two possibilities are + ~ that the word is genuinely missing, or it's an integer literal. + ~ + ~ Please notice that these are the same two possibilities remaining at + ~ the end of the immediate-mode branch, below. + } { + ~ If we got here, we're in interpret mode. There are three + ~ possibilities: there's an immediate word which we should run; it's an + ~ integer literal; or the word is genuinely missing. + ~ + ~ If the immediate entry pointer is non-zero, run it and exit. + over { + drop dropstring-with-result entry-to-execution-token execute + 0 exit + } if - ~ If we're still here, it wasn't in the dictionary. Also, we don't need - ~ the immediate entry pointer, either. - drop drop + ~ There was no immediate word, so either it's an integer literal or + ~ the word is genuinely missing. Please notice that these are the same two + ~ possibilities remaining at the end of the compile-mode branch, above. + ~ + ~ We no longer need the immediate-mode pointer, so drop it. + drop drop + } if-else ~ (name as stack string) - ~ If it's not in the dictionary, check whether it's an integer literal. 
As - ~ before, we get the stack address and use it as a string pointer. + ~ If we got here, one of two things is true: the word is an integer + ~ literal, or it's genuinely missing. We know this because both the mode + ~ cases above end with these as the only two remaining possibilities. So + ~ + ~ Check whether it's an integer literal. As before, we get the stack + ~ address and use it as a string pointer. value@ read-integer 0 = { ~ It's a number. interpreter-flags @ 0x01 & { @@ -536,8 +626,13 @@ allocate-transform-state s" transform-state" variable 0 exit } if - ~ If it's neither in the dictionary nor a number, just print an error. - s" No such word: " emitstring value@ emitstring dropstring 0 ; + ~ If it's neither in the dictionary nor a number, just print an error. + ~ + ~ It's really important, when maintaining this code, to make sure that all + ~ the possible ways the word can fail to exist, end up here. Doing anything + ~ else is going to result in many hours of trying to untangle the + ~ consequences of incorrect behavior, after-the-fact. + s" No such word: " emitstring value@ emitstring newline dropstring 0 ; ~ This implements the label transform for all words in a region given as an @@ -623,17 +718,31 @@ allocate-transform-state s" transform-state" variable ~ to be extremely useful to read and understand ":" in interpret.e before ~ attempting to understand "log-load-colon-alternate". : log-load-colon-alternate - ~ ~ This calls "log-load-create" instead of "create". + ~ This calls "log-load-create" instead of "create". word value@ log-load-create-alternate dropstring - ~ This looks up "docol" by label. - ~ swap-transform-variables - ~ L@' docol - ~ L@' origin - ~ swap-transform-variables - ~ + , + ~ We generate code that looks up "docol" by name, runs it to get the + ~ codeword pointer, then finally appends it to the entry. + swap-transform-variables + ~ As usual, we do these in reverse. 
+ L@' log-load-comma + L@' execute + L@' log-load-find-execution-token + L@' litstring + swap-transform-variables - ~ TODO note no hiding the entry + offset-to-target-address-space , ~ litstring + here @ s" docol" packstring 8 packalign here ! + offset-to-target-address-space , ~ log-load-find-execution-token + offset-to-target-address-space , ~ execute + offset-to-target-address-space , ~ log-load-comma + + ~ This is where we would mark the entry hidden, but we don't do that. It + ~ won't shadow anything and it won't be called until the entire log-load + ~ routine has finished. + + ~ Switching between immediate and compile mode is one of the very few + ~ things that happens NOW, while the log-load transform is actually running. ] ; @@ -643,16 +752,24 @@ allocate-transform-state s" transform-state" variable ~ likely to be extremely useful to read and understand ";" in interpret.e ~ before attempting to understand "log-load-semicolon-alternate". : log-load-semicolon-alternate - ~ ~ This looks up "exit" by label. - ~ swap-transform-variables - ~ L@' exit - ~ swap-transform-variables - ~ offset-to-target-address-space , + ~ We generate code that looks up "exit" by name and appends it to the + ~ entry. + swap-transform-variables + ~ As usual, we do these in reverse. + L@' log-load-comma + L@' log-load-find-execution-token + L@' litstring + swap-transform-variables - ~ latest @ unhide-entry + offset-to-target-address-space , ~ litstring + here @ s" exit" packstring 8 packalign here ! + offset-to-target-address-space , ~ log-load-find-execution-token + offset-to-target-address-space , ~ log-load-comma - ~ ~ Since [ is an immediate word, we have to go to extra trouble to compile - ~ ~ it as part of ;. + ~ This is where we would unhide the entry, but again, we don't do that. + + ~ Since [ is an immediate word, we have to go to extra trouble to compile + ~ it as part of ;. 
[ ' [ entry-to-execution-token , ] ; make-immediate @@ -673,6 +790,60 @@ allocate-transform-state s" transform-state" variable ~ [ ' [ entry-to-execution-token , ] ; make-immediate + +~ Because docol requires it, we provide a special mini-version of the label +~ system. We only do L@' and L!', because that's all we need. Unlike the +~ version of this feature for the label transform, for the log-load transform, +~ we heavily restrict the use-case. +~ +~ The implementation strategy is that we ignore the label name, and store +~ the value on the stack when the generated log-load routine runs. So, each +~ instance of L@' must be closely followed by a matching instance of L!'. Each +~ label can only ever be used exactly once, and it must be a backward +~ reference. Furthermore, there is a very tight restriction on what can be +~ on the stack. The easiest way to explain it is by showing the interface of +~ these words from the transformed code's perspective: +~ +~ L!' is (preserved value, value of label +~ -- value of label, preserved value) +~ L@' is (value of label, preserved value +~ -- preserved value, value of label) +~ +~ The preserved value is simply another item on the stack, which the label +~ takes pains not to interfere with. +~ +~ There is no adjustment done on the saved value, since it's created in the +~ target address space and then also used in the target address space. It +~ wouldn't actually be necessary to use this at all, since checking "here" +~ would be sufficient, but then the code would have to do something different +~ depending on which transform it's running under, and there'd have to be a +~ mechanism for that. +~ +~ If that sounds super complex: All we actually do is read a label name, +~ ignore it, and output a call to swap. +~ +~ This is sufficient to implement docol, and that's probably the only thing +~ it should be used for. 
+: log-load-L@'-alternate + word dropstring + + swap-transform-variables + L@' swap + swap-transform-variables + + offset-to-target-address-space , ~ swap + ; make-immediate + +: log-load-L!'-alternate + word dropstring + + swap-transform-variables + L@' swap + swap-transform-variables + + offset-to-target-address-space , ~ swap + ; make-immediate + ~ This implements the log-load transform for a single word. It is directly ~ analogous to "interpret", and reading interpret.e may help in understanding ~ it, though it's meant to still make sense on its own. @@ -706,6 +877,8 @@ allocate-transform-state s" transform-state" variable swap drop ' log-load-semicolon-alternate swap } if dup s" ;asm" stringcmp 0 = { swap drop ' log-load-semicolon-assembly-alternate swap } if + dup s" L@'" stringcmp 0 = { swap drop ' log-load-L@'-alternate swap } if + dup s" L!'" stringcmp 0 = { swap drop ' log-load-L!'-alternate swap } if drop ~ (name as stack string, 0 or alternate entry pointer) @@ -787,10 +960,46 @@ allocate-transform-state s" transform-state" variable ~ (name as stack string) ~ We're in immediate mode. We compile code that runs the word immediately. - ~ We check whether there's a label for the word; if there is, we output - ~ that. Otherwise we output code that looks it up and runs it. - ~ TODO + ~ We check whether there's a label for the word; if there is, we'll output + ~ that. Otherwise we'll output code that looks it up in the log and runs it. + ~ + ~ Just like in label-transform, we use find-label to check whether a label + ~ exists without declaring a dependency on it, then if it does, we do + ~ use-label to ask for its value. + ~ + ~ There's one additional wrinkle to remember here: We're running inside + ~ the label loop, and warm-start appears before all the normal words in the + ~ executable. So all the labels we'll be checking are forwared references, + ~ and on the very first pass they definitely won't be defined. 
That's fine + ~ though, they will exist on all subsequent passes, so things will + ~ definitely still converge. + ~ + ~ The first pass will never accidentally think it succeeded, because even + ~ the reference to L' cold-start from the ELF header is a forward reference + ~ and won't exist on the first pass. + value@ + swap-transform-variables + find-label + swap-transform-variables + { + ~ Again just like in label-transform, we declare our use of the label + ~ and get a value for it. + value@ + swap-transform-variables + intern-label use-label + swap-transform-variables + + ~ Like in label-transform, this is a codeword pointer, so we just output + ~ it directly. Also as before, because we don't have to examine it, we + ~ don't have to do anything special in the case where it's zero due to the + ~ way the label loop works. + offset-to-target-address-space , dropstring 0 exit + dropstring 0 exit + } if + ~ There's no label for the word; that means it wasn't statically + ~ compiled-in to the target executable. So we output code that looks up the + ~ word by name on the log, then calls it. value@ swap-transform-variables ~ This is reverse order again. |