Diffstat (limited to 'execution.e')
| -rw-r--r-- | execution.e | 143 |
1 files changed, 47 insertions, 96 deletions
diff --git a/execution.e b/execution.e
index daacddb..1b9e84d 100644
--- a/execution.e
+++ b/execution.e
@@ -484,16 +484,11 @@
  3unroll 8 packalign
  current-offset L!' warm-start
- 3roll
-
- log-load-transform
-
- ~ TODO this is tied to the specific example in evoke
- ~ L@' happy-path L@' origin + pack64
+ ~ (input string pointer, output buffer start, output point)
  ~ Before handing off to us, cold-start pushed a single value onto the
- ~ stack, a pointer to the beginning of the heap. Now, we load our entire
- ~ Forth implementation onto that heap, beginning with the minimal set of
+ ~ stack, a pointer to the beginning of the log. Now, we load our entire
+ ~ Forth implementation onto that log, beginning with the minimal set of
  ~ words needed to define more words. We do this because we need variables as
  ~ infrastructure so we can eventually have dynamic definitions.
  ~
@@ -514,117 +509,73 @@
  ~ That choice does mean we have the hard version of this bootstrapping
  ~ problem, and copying ourselves to the heap is how we solve it.
  ~
- ~ We do have the heap address right now, though that won't last. In case
+ ~ We do have the log address right now, though that won't last. In case
  ~ it's unclear why not: keeping it on the stack would require all future
  ~ references to walk the stack, and somehow know when they've reached the
  ~ bottom. The stack is a good place to keep things with clearly delimited
  ~ lifetimes and visibility, but when we want something to live for our
  ~ entire program and be easy to find from any code within it, we need to
  ~ do something else. Anyway, since we have the address, we can use it for
- ~ the next little bit of setup.
+ ~ the next little bit of setup. We have a bunch of helper words, from
+ ~ log-load.e, which make this easier.
  ~
  ~ The first few words we define are our variables, which hardcode the
  ~ addresses they will return - but since we're doing this at runtime,
- ~ "hardcoding" can reflect where our heap is. This is the fundamental
- ~ trick that makes the heap usable.
+ ~ "hardcoding" can reflect where our log is. This is the fundamental
+ ~ trick that makes the log usable.
  ~
  ~ One more thing to notice: We already allocated the backing stores of
- ~ these variables, and populated their initial values, in _start. The
+ ~ these variables, and populated their initial values, in cold-start. The
  ~ words we're defining return those same addresses for the same backing
  ~ stores. So, we have continuity: Stuff defined in terms of the
  ~ variable-words we're defining now will interoperate with the stuff that
- ~ we define in the "early" way, which includes those very words. Both the
- ~ early code and the later code are dealing with the same data structures,
- ~ they're just using a different technique to find them.
+ ~ we define using the log-load helpers, which includes those very words.
+ ~ Both the log-load code and the later code are dealing with the same data
+ ~ structures, they're just using a different technique to find them.
  ~
  ~ This is the only hardcoding we need to do; by building on top of it,
  ~ we will soon reach a point where the rest of the system can be defined
  ~ within itself.
- ~ TODO These need to, like, exist first. Also they need to be referenced
- ~ as labels.
- ~ dq early_heap, litstring, "heap", early_variable
- ~ dq early_s0, litstring, "s0", early_variable
- ~ dq early_r0, litstring, "r0", early_variable
- ~ dq early_latest, litstring, "latest", early_variable
- ~ dq early_here, litstring, "here", early_variable
- ;
+ L@' log-load-log offset-to-target-address-space pack64
+ L@' litstring offset-to-target-address-space pack64
+ s" log" packstring 8 packalign
+ L@' log-load-variable offset-to-target-address-space pack64
-~ (previous entry address, output point, name string pointer
-~ -- new entry address, output point)
-: output-create
- 3roll dup 4 roll swap pack64
- ~ (string pointer, new entry address, output point)
- 0 pack8
- 0 pack8
- roll3 packstring
- ~ (new entry address, output point)
- 8 packalign
- ;
+ L@' log-load-s0 offset-to-target-address-space pack64
+ L@' litstring offset-to-target-address-space pack64
+ s" s0" packstring 8 packalign
+ L@' log-load-variable offset-to-target-address-space pack64
+ L@' log-load-r0 offset-to-target-address-space pack64
+ L@' litstring offset-to-target-address-space pack64
+ s" r0" packstring 8 packalign
+ L@' log-load-variable offset-to-target-address-space pack64
-~ Routine docol
-~ ~~~~~~~~~~~~~
-~
-~ Reference this via its label as the codeword of a word to make it an
-~ "interpreted" word. Concretely, it saves rsi (the "instruction pointer")
-~ to the control stack, takes the address of the codeword from rax and
-~ increments it in-place to form the new instruction pointer, and copies
-~ that to rsi.
-~
-~ Having then done this, we're now in the state that normal execution
-~ expects, so docol ends by it using "next" to begin the callee's execution,
-~ kicking off a nested call.
-~
-~ The name is said to be short for "do colon", because Forth high-level
-~ code begins word definitions with a colon.
-~
-~ Registers in:
-~
-~ * rsi is the caller's instruction pointer
-~ * rbp is the control stack pointer
-~ * rax is the address of the callee's codeword
-~
-~ Registers out:
-~
-~ * rsi is the callee's instruction pointer
-~ * rbp is the control stack pointer
-~
-~ (previous entry address, output point)
-: output-docol
- s" docol" output-create
+ L@' log-load-latest offset-to-target-address-space pack64
+ L@' litstring offset-to-target-address-space pack64
+ s" latest" packstring 8 packalign
+ L@' log-load-variable offset-to-target-address-space pack64
- ~ Evaluated as a word, docol is a constant which returns a pointer.
- L@' docol :rax mov-reg64-imm64
- :rax push-reg64
- pack-next
- 8 packalign
+ L@' log-load-here offset-to-target-address-space pack64
+ L@' litstring offset-to-target-address-space pack64
+ s" here" packstring 8 packalign
+ L@' log-load-variable offset-to-target-address-space pack64
- ~ Since docol is not a normal word, the label points to the value we care
- ~ about from the assembly side of things, which is the address we use as the
- ~ codeword.
- current-offset L!' docol
- :rsi pack-pushcontrol
- 8 :rax add-reg64-imm8
- :rax :rsi mov-reg64-reg64
- pack-next
- 8 packalign
- ;
+ ~ Having done that, nothing else needs to be defined in an unusual way, so
+ ~ we can go ahead and dispatch to the log-load transform, and do the rest of
+ ~ the code through that.
+
+ ~ (input string pointer, output buffer start, output point)
+ 3roll
+ log-load-transform
  ;
-~ This is the mechanism to "return" from a word interpreted by docol.
-~ We pop the control stack, and then, since this is threaded execution, we
-~ do the next thing the caller wants to do, by inlining "next".
-~
-~ This word would work fine with the label transformation, so we could put
-~ it in core.e, but we choose to define it here because it's easier to
-~ understand when it's close to the rest of the execution stuff.
-~
-~ (previous entry address, output point
-~ -- new entry address, output point)
-: output-exit
- s" exit" output-create
- current-offset L!' exit
- :rsi pack-popcontrol
- pack-next
- ;
+
+~ Where next?
+~ ~~~~~~~~~~~
+~
+~ The definitions of "docol" and "exit" are very tightly bound up with the
+~ execution model. They're defined and explained in core.e, because they need
+~ to be part of the build process in two different ways, like the other core
+~ functionality.