| author | Irene Knapp <ireneista@irenes.space> | 2026-05-17 17:29:09 -0700 |
|---|---|---|
| committer | Irene Knapp <ireneista@irenes.space> | 2026-05-17 17:29:09 -0700 |
| commit | 3b41dbfa2338c11dd8398026c00922f20f32dc81 (patch) | |
| tree | 1761f2b8f2da972b830cf617d194c7c5be5b1535 /transform.e | |
| parent | 4f1a07da9c87a1560da34b8a96a9de4cdc90f1fc (diff) | |
implement log-load-create; add all the pack/unpack stuff to core.e
Force-Push: yes
Change-Id: I04dd65a9eec71f9b50c8875bdcbe5d4be59888d5
Diffstat (limited to 'transform.e')
| -rw-r--r-- | transform.e | 211 |
1 files changed, 129 insertions, 82 deletions
diff --git a/transform.e b/transform.e
index 5ddac78..42901f4 100644
--- a/transform.e
+++ b/transform.e
@@ -16,36 +16,51 @@
 ~ specific way. The transforms rely on the label facility provided by
 ~ labels.e, and expect to run from within label-loop.
 ~
-~ The label transform operates on code that compiles itself, and ensures
-~ that the result of the compilation is suitable to be included in an
-~ executable binary as words that are statically referenced by their
-~ addresses. To achieve this, it causes each newly-defined word to have a
-~ corresponding label whose value is the offset of its codeword, and it causes
-~ all compiled invocations of other words to be resolved by using these labels.
-~ The label transform is suitable for code that must be directly invoked by
-~ the warm-start routine provided by execution.e.
-~
-~ The log-load transform also operates on code that compiles itself; it
-~ produces a compiled routine which, when run, appends the original code to
-~ the log. As the routine is run, each reference to another word is resolved
-~ by looking up the name of the target word in the log. Furthermore, these
-~ lookups are done using log-load-find, defined in log-load.e, which accepts
-~ a pointer to the log's base address as a parameter. See that file for more
-~ explanation of what the log is and why it's important. Thus, unlike normal
-~ accesses to the log, this routine doesn't rely on already having the log's
-~ base address hardcoded into it at the time of its own compilation. The
-~ log-load transform is suitable for implementing the core responsibilities of
-~ the warm-start routine provided by execution.e, relying on only a few
-~ specific words that it statically references via labels.
+~ The label transform produces code that uses one label per word it defines,
+~ to statically reference everything. Thus, when output to an executable
+~ binary, this code will function without external dependencies. The tradeoff
+~ is that it has no way to reference data that exists only at runtime.
+~
+~ The log-load transform relies on labels, but doesn't add any of its own.
+~ It produces a compiled routine which, when run, dynamically looks up all the
+~ references in the log, and appends the original code to the log. This adds
+~ work that must be done when the runtime starts up, but the benefit is that
+~ it can reference data that doesn't exist at compile-time. Most crucially,
+~ it can reference the "here" and "latest" pointers in the log, which are
+~ required for all the usual word-definition stuff to work.
 ~
 ~ The log-load transform may also be useful for experimental tasks such as
 ~ creating additional, independent logs, or injecting Evocation into another
 ~ process's address space.
 ~
+~ Please notice that both these transforms, in different ways, navigate the
+~ same underlying design tension: The Forth compilation model hardcodes
+~ references at the time compilation happens, and Evocation makes the choice
+~ to not decide the address of the log until runtime. Thus the label transform
+~ can't be sufficient on its own. Other Forths avoid this problem by
+~ hardcoding an address for the log, or by using OS-provided load-time
+~ symbol relocation. Evocation, however, does it on hard mode, mostly for fun.
+~
+~ Because it was clear from early on that the label transform couldn't stand
+~ alone, and that another one would be necessary, we've refrained from adding
+~ too many features to it. Since we have multiple transforms, they should each
+~ be kept simple and well-defined, so that they can be composed in creative
+~ new ways down the line. When adding additional behavior, always give thought
+~ to whether it belongs in an existing transform or a new one.
+~
 ~
 ~ About the label transform
 ~ ~~~~~~~~~~~~~~~~~~~~~~~~~
 ~
+~ The label transform operates on code that compiles itself, and ensures
+~ that the result of the compilation is suitable to be included in an
+~ executable binary as words that are statically referenced by their
+~ addresses. To achieve this, it causes each newly-defined word to have a
+~ corresponding label whose value is the offset of its codeword, and it causes
+~ all compiled invocations of other words to be resolved by using these labels.
+~ The label transform is suitable for code that must be directly invoked by
+~ the warm-start routine provided by execution.e.
+~
 ~ The most fundamental technique the label transform performs is to separate
 ~ words that run in compile mode from words that run immediately. There is no
 ~ distinction made between words running in immediate mode, and words declared
@@ -105,6 +120,19 @@
 ~ About the log-load transform
 ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~
+~ The log-load transform also operates on code that compiles itself; it
+~ produces a compiled routine which, when run, appends the original code to
+~ the log. As the routine is run, each reference to another word is resolved
+~ by looking up the name of the target word in the log. Furthermore, these
+~ lookups are done using log-load-find, defined in log-load.e, which accepts
+~ a pointer to the log's base address as a parameter. See that file for more
+~ explanation of what the log is and why it's important. Thus, unlike normal
+~ accesses to the log, this routine doesn't rely on already having the log's
+~ base address hardcoded into it at the time of its own compilation. The
+~ log-load transform is suitable for implementing the core responsibilities of
+~ the warm-start routine provided by execution.e, relying on only a few
+~ specific words that it statically references via labels.
+~
 ~ Much like the label transform, the log-load transform provides alternate
 ~ versions of certain immediate words used in word definition. Also like the
 ~ label transform, it provides its own copies of "here" and "latest".
@@ -223,7 +251,7 @@ allocate-transform-state s" transform-state" variable
 ~ When calling the label facility during a transformation, it's necessary
-~ to use the real, non-wrapped "heap" and "latest".
+~ to use the real, non-wrapped "here" and "latest".
 : swap-transform-variables
  here @ transform-state transform-state-saved-here @ here ! transform-state transform-state-saved-here !
@@ -458,11 +486,11 @@ allocate-transform-state s" transform-state" variable
  ~ which is already what we want to output.
  ~
  ~ An important caveat: Though it would require something weird to be
- ~ happening, such as a forced forward reference, the label may be zero!
- ~ We need to allow for that possibility by not examining the contents of
- ~ a nonexistent entry.
+ ~ happening, such as a forced forward reference, the label may be
+ ~ zero! We need to allow for that possibility by not examining the
+ ~ contents of a nonexistent entry.
  ~
- ~ Fortunately we don't have to look at it, just append it to the heap
+ ~ Fortunately we don't have to look at it, just append it to the log
  ~ and clean up.
  offset-to-target-address-space , drop dropstring 0 exit } if
@@ -489,7 +517,7 @@ allocate-transform-state s" transform-state" variable
  ~ It's a number.
  interpreter-flags @ 0x01 & {
  ~ We're in compile mode; append first "lit", then the number, to the
- ~ heap. The version of "lit" we use is found by label, so it'll be the
+ ~ log. The version of "lit" we use is found by label, so it'll be the
  ~ one that exists when this code is ultimately run.
  dropstring-with-result
@@ -572,30 +600,32 @@ allocate-transform-state s" transform-state" variable
 ~ below. It is likely to be extremely useful to read and understand "create"
 ~ in interpret.e before attempting to understand log-load-create.
 : log-load-create
- dup stringlen 1 + dup 3unroll
- here @ 10 + 3unroll memmove
- here @
-
- ~ This value of "latest" is going into the generated output, so we need
- ~ to map it to the target address space. It's stored in the host address
- ~ space to make immediate words work as expected, so the appropriate
- ~ conversion is host-address-space-to-target.
- latest @ host-address-space-to-target pack64
- 0 pack8
- 0 pack8
- +
- 8 packalign
- here @ latest !
-
- ~ Now we're immediately after the word header, which is where the codeword
- ~ will be. This is the value the label should taken on, so we set it.
- dup host-address-space-to-offset
- here @ 10 +
+ ~ dup stringlen 1 + dup 3unroll
+ ~ here @ 10 + 3unroll memmove
+ ~ here @
+
+ ~ ~ This value of "latest" is going into the generated output, so we need
+ ~ ~ to map it to the target address space. It's stored in the host address
+ ~ ~ space to make immediate words work as expected, so the appropriate
+ ~ ~ conversion is host-address-space-to-target.
+ ~ latest @ host-address-space-to-target pack64
+ ~ 0 pack8
+ ~ 0 pack8
+ ~ +
+ ~ 8 packalign
+ ~ here @ latest !
+
+ ~ ~ Now we're immediately after the word header, which is where the codeword
+ ~ ~ will be. This is the value the label should taken on, so we set it.
+ ~ dup host-address-space-to-offset
+ ~ here @ 10 +
+ 0 swap ~ DO NOT SUBMIT
  swap-transform-variables intern-label set-label swap-transform-variables
- here ! ;
+ ~ here !
+ ;
 ~ This is the alternate version of ":" for use with the log-load transform.
@@ -603,17 +633,19 @@ allocate-transform-state s" transform-state" variable
 ~ likely to be extremely useful to read and understand ":" in interpret.e
 ~ before attempting to understand "log-load:".
 : log-load:
- ~ This calls "log-load-create" instead of "create".
+ ~ ~ This calls "log-load-create" instead of "create".
  word value@ log-load-create dropstring
  ~ This looks up "docol" by label.
- swap-transform-variables
- L@' docol
- L@' origin
- swap-transform-variables
- + ,
+ ~ swap-transform-variables
+ ~ L@' docol
+ ~ L@' origin
+ ~ swap-transform-variables
+ ~ + ,
- latest @ hide-entry ] ;
+ ~ TODO note no hiding the entry
+ ]
+ ;
 ~ This is the alternate version of ";" for use with the log-load transform.
@@ -621,16 +653,16 @@ allocate-transform-state s" transform-state" variable
 ~ likely to be extremely useful to read and understand ";" in interpret.e
 ~ before attempting to understand "log-load;".
 : log-load;
- ~ This looks up "exit" by label.
- swap-transform-variables
- L@' exit
- swap-transform-variables
- offset-to-target-address-space ,
+ ~ ~ This looks up "exit" by label.
+ ~ swap-transform-variables
+ ~ L@' exit
+ ~ swap-transform-variables
+ ~ offset-to-target-address-space ,
- latest @ unhide-entry
+ ~ latest @ unhide-entry
- ~ Since [ is an immediate word, we have to go to extra trouble to compile
- ~ it as part of ;.
+ ~ ~ Since [ is an immediate word, we have to go to extra trouble to compile
+ ~ ~ it as part of ;.
  [ ' [ entry-to-execution-token , ]
  ; make-immediate
@@ -640,15 +672,15 @@ allocate-transform-state s" transform-state" variable
 ~ below. It is likely to be extremely useful to read and understand ";asm" in
 ~ interpret.e before attempting to understand "log-load;asm".
 : log-load;asm
- here @ pack-next 8 packalign here !
- latest @ dup unhide-entry entry-to-execution-token
- ~ The codeword needs to be transformed to the target address space.
- dup 8 + host-address-space-to-target
- swap !
-
- ~ Since [ is an immediate word, we have to go to extra trouble to compile
- ~ it as part of ;asm.
- [ ' [ entry-to-execution-token , ]
+ ~ here @ pack-next 8 packalign here !
+ ~ latest @ dup unhide-entry entry-to-execution-token
+ ~ ~ The codeword needs to be transformed to the target address space.
+ ~ dup 8 + host-address-space-to-target
+ ~ swap !
+
+ ~ ~ Since [ is an immediate word, we have to go to extra trouble to compile
+ ~ ~ it as part of ;asm.
+ ~ [ ' [ entry-to-execution-token , ]
  ; make-immediate
 ~ This implements the log-load transform for a single word. It is directly
@@ -669,8 +701,6 @@ allocate-transform-state s" transform-state" variable
  ~ (string)
  value@
- dup emitstring newline
-
  ~ If it's the magic word, end the transformation.
  dup s" pyrzqxgl" stringcmp 0 = { drop dropstring 1 exit } if
@@ -695,6 +725,7 @@ allocate-transform-state s" transform-state" variable
  dropstring-with-result entry-to-execution-token execute 0 exit } if
+ drop
  ~ (name as stack string)
  ~ Now we might have a compiled word, an immediate word, or an integer
@@ -711,6 +742,8 @@ allocate-transform-state s" transform-state" variable
  ~ It's a number.
  dropstring-with-result
+ drop ~ TODO placeholder
+
  interpreter-flags @ 0x01 & {
  ~ We're in compile mode, so we want to generate code which will compile
  ~ the number.
@@ -721,24 +754,40 @@ allocate-transform-state s" transform-state" variable
  ~ We're in interpret mode, so we want to generate code which will push the
  ~ number to the stack.
  ~ TODO
+ swap-transform-variables L@' lit swap-transform-variables
+ offset-to-target-address-space , ,
  0 exit } if
- drop
  ~ (name as stack string)
  ~ We know it's a regular word, and we're assuming it will exist at
  ~ runtime. We of course have no way to check what flags it will have, which
  ~ means immediate words don't work with this transform. We still treat it
  ~ differently based on whether we're in compile mode.
- interpreter-flags @ 0x01 & {
- ~ We're in compile mode. We compile code that compiles the word.
- ~ TODO
- dropstring 0 exit
- } if
+ ~ interpreter-flags @ 0x01 & {
+ ~ ~ We're in compile mode. We compile code that compiles the word.
+ ~ ~ TODO
+ ~ dropstring 0 exit
+ ~ } if
  ~ (name as stack string)
- ~ We're in immediate mode. We compile code that runs the word immediately.
+ ~ We're in immediate mode. We compile code that runs the word immediately.
+ ~ We check whether there's a label for the word; if there is, we output
+ ~ that. Otherwise we output code that looks it up and runs it.
  ~ TODO
+ value@
+ swap-transform-variables
+ ~ Looking these up in reverse order saves us some stack juggling. Does
+ ~ help readability, or hurt it? Who can say...
+ L@' execute
+ L@' log-load-find-execution-token
+ L@' litstring
+ swap-transform-variables
+ offset-to-target-address-space , ~ litstring
+ 3roll here @ swap packstring 8 packalign here !
+ offset-to-target-address-space , ~ log-load-find-execution-token
+ offset-to-target-address-space , ~ execute
+
  ~ There's no such thing as not finding the word, with this transform. So
  ~ we just exit.
@@ -749,8 +798,6 @@ allocate-transform-state s" transform-state" variable
 ~ an input string. It is directly analogous to "quit", in interpret.e, but is
 ~ far more complex.
 ~
-~ TODO TODO TODO this is just a stub, right now it's just a copy of the label
-~ transform
 ~ (output buffer start, output point, input string pointer
 ~ -- output buffer start, output point)
 : log-load-transform
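
As a rough illustration of the design tension the new header comment describes, here is a toy Python model. It is not Evocation code, and every function and name in it is hypothetical. It contrasts resolving references to fixed label offsets at compile time with emitting a push-name, look-up, execute sequence that is only resolved against whichever log exists at run time, which is loosely the shape of what the new regular-word branch outputs (litstring, log-load-find-execution-token, execute).

```python
# Toy model of the two strategies; every name here is hypothetical.

def label_transform(words, labels):
    """Resolve each reference at compile time to a fixed label offset."""
    return [("call", labels[name]) for name in words]


def log_load_transform(words):
    """Emit, per word, a push-name / find / execute sequence for run time."""
    routine = []
    for name in words:
        routine += [("litstring", name), ("find",), ("execute",)]
    return routine


def run_log_load(routine, log):
    """Run a generated routine against a log supplied only at run time."""
    stack, results = [], []
    for op in routine:
        if op[0] == "litstring":
            stack.append(op[1])             # push the word's name
        elif op[0] == "find":
            stack.append(log[stack.pop()])  # look the name up in this log
        elif op[0] == "execute":
            results.append(stack.pop()())   # run whatever was found
    return results


static_code = label_transform(["dup", "swap"], {"dup": 0x40, "swap": 0x58})
routine = log_load_transform(["here", "latest"])

# Two stand-in logs at different addresses; the same routine works on both.
log_a = {"here": lambda: 0x1000, "latest": lambda: 0x0FF0}
log_b = {"here": lambda: 0x7000, "latest": lambda: 0x6FF0}
```

In this sketch the statically resolved code is fixed once the labels are known, while the generated routine can be run unchanged against `log_a` or `log_b`, at the cost of doing the lookups every time it runs.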