Diffstat (limited to 'execution.e')
| -rw-r--r-- | execution.e | 62 |
1 files changed, 31 insertions, 31 deletions
diff --git a/execution.e b/execution.e
index 2c8b869..e1b894d 100644
--- a/execution.e
+++ b/execution.e
@@ -27,7 +27,7 @@
 ~
 ~ We adopt this model of words, codewords, and variables-as-words. It's
 ~ really nice how it doesn't force anything else on us, not even a heap,
-~ though we do end up using a heap.
+~ though we do end up using the log as a sort-of heap.
 ~
 ~ We specifically implement a version of calling and returning that Forth
 ~ calls indirect threaded code: The control stack is a stack of pointers
@@ -48,8 +48,8 @@
 ~ Notionally, we could consider not having a dictionary, and not giving
 ~ our words names. However, it feels silly to stop when we're so close to
 ~ being a full Forth, and using names for things solves a bootstrapping
-~ problem related to heap management - see the write-up of _start about how
-~ the heap is created, below. So, we add an additional header before the
+~ problem related to log management - see the write-up of cold-start about
+~ how the log is created, below. So, we add an additional header before the
 ~ codeword for this purpose.
 ~
 ~ The Forth dictionary is usually a linked list of every word that has
@@ -153,8 +153,8 @@
 ~
 ~ These are used in cold-start, just below.
 
-: heap-requested-address 0x0000001000000000 ; ~ (very arbitrary)
-: heap-size 0x0000000001000000 ; ~ 16 MiB
+: log-requested-address 0x0000001000000000 ; ~ (very arbitrary)
+: log-size 0x0000000001000000 ; ~ 16 MiB
 : control-stack-size 0x10000 ; ~ 64 KiB
 
 
@@ -205,14 +205,14 @@
 ~
 ~ Stack out:
 ~
-~ * The value of "heap", as a pointer
+~ * The value of "log", as a pointer
 ~   The meaning of this will be explained below.
 ~
 ~ Registers within:
 ~
-~ * rdi points to the base the heap was allocated at, once it exists
-~   This is the same value that "heap" will hold, once we reach a point
-~   where we have variables. Of course, variables are stored on the heap,
+~ * rdi points to the base the log was allocated at, once it exists
+~   This is the same value that "log" will hold, once we reach a point
+~   where we have variables. Of course, variables are stored on the log,
 ~   hence this temporary measure.
 ~
 ~ We also take this opportunity to define soeme memory layout parameters
@@ -224,7 +224,7 @@
   current-offset L' cold-start set-label
   cld ~ clear the DF flag
 
-  ~ Prepare the heap.
+  ~ Prepare the log.
   ~
   ~ We could ask for a data segment in the program header, but where's the
   ~ fun in that? Instead, we call mmap().
@@ -237,8 +237,8 @@
   ~ interoperating with other runtimes.
   ~
   9 :rax mov-reg64-imm64 ~ mmap()
-  heap-requested-address :rdi mov-reg64-imm64 ~ address (very arbitrary)
-  heap-size :rsi mov-reg64-imm64 ~ size (one meg)
+  log-requested-address :rdi mov-reg64-imm64 ~ address (very arbitrary)
+  log-size :rsi mov-reg64-imm64 ~ size (one meg)
   0x07 :rdx mov-reg64-imm64 ~ protection (read+write+exec)
   0x22 :r10 mov-extrareg64-imm64 ~ flags (private+anonymous)
   0 :r8 mov-extrareg64-imm64 ~ file descriptor (ignored)
@@ -251,20 +251,20 @@
   ~ are widely-used names for the physical tops (logical bottoms) of the
   ~ value and control stacks, respectively, and we will eventually set those
   ~ up as well, so we should keep those names in mind. The control stack
-  ~ lives within the heap, while the value stack is its own segment. This
+  ~ lives within the log, while the value stack is its own segment. This
   ~ value, though, is the physical bottom of the segment, meaning that it
   ~ stays the same even as we allocate and deallocate things within it. This
   ~ is unlike the two stack pointers, so we give it a name that doesn't
-  ~ suggest similarity: "heap".
+  ~ suggest similarity: "log".
   ~
   ~ Once Forth is fully set up, its internal variables will be accessed
-  ~ through variable-words like any other Forth data, including "heap". To
+  ~ through variable-words like any other Forth data, including "log". To
   ~ get to that point, though, we need to be able to hold onto variable data
-  ~ between now and then. In fact, if we don't have at least one of "heap"
+  ~ between now and then. In fact, if we don't have at least one of "log"
   ~ and "here" (its counterpart which points to the logical top end), all
   ~ our efforts to hold onto anything seem a bit doomed.
   ~
-  ~ So, we temporarily dedicate rdi to "heap" - only within this routine -
+  ~ So, we temporarily dedicate rdi to "log" - only within this routine -
   ~ and store everything else in ways that let us find things by reference
   ~ to it. We choose rdi because it works with the indexing modes we care
   ~ about, and its name suggests its function.
@@ -274,7 +274,7 @@
   ~ pre-allocated objects in the data segment. We are our own linker, and we
   ~ don't care to have a data segment. Hence, this approach.
   ~
-  ~ Keying things off "heap" is the fundamental decision, but to make sure
+  ~ Keying things off "log" is the fundamental decision, but to make sure
   ~ our variables are accessible both during early bootstrapping, and later,
   ~ we also have to be thoughtful about data structures. More on that in a
   ~ moment.
@@ -293,17 +293,17 @@
   ~
   :rdi control-stack-size :rbp lea-reg64-disp32-reg64
 
-  ~ Now we save some stuff onto the heap. These are the locations that
+  ~ Now we save some stuff onto the log. These are the locations that
   ~ will eventually be the backing stores of the Forth variables, but we
   ~ don't create the word headers yet, since there's no requirement that
   ~ they be next to the backing stores. We'll do that later, once we have
   ~ word-writing infrastructure in place. For now, we just use their offsets
-  ~ relative to the physical bottom of the heap, which are fixed.
+  ~ relative to the physical bottom of the log, which are fixed.
   ~
   ~ These will be the permanent homes of these values, though we have
   ~ copies of them elsewhere while we're still in this routine.
   ~
-  :rdi control-stack-size 0x00 + :rdi mov-reg64-disp32-reg64 ~ heap
+  :rdi control-stack-size 0x00 + :rdi mov-reg64-disp32-reg64 ~ log
   :rsp control-stack-size 0x08 + :rdi mov-reg64-disp32-reg64 ~ s0
   :rbp control-stack-size 0x10 + :rdi mov-reg64-disp32-reg64 ~ r0
   L@' final-word-name :rax mov-reg64-imm64
@@ -311,8 +311,8 @@
   :rdi control-stack-size 0x28 + :rax lea-reg64-disp32-reg64
   :rax control-stack-size 0x20 + :rdi mov-reg64-disp32-reg64 ~ here
   ~
-  ~ * "heap" is the physical bottom of the heap
-  ~   The heap grows upwards in memory, so this is also the logical
+  ~ * "log" is the physical bottom of the log
+  ~   The log grows upwards in memory, so this is also the logical
   ~   bottom. This comes from the address mmap() just returned to us.
   ~ * "s0" is the logical bottom of the value stack
   ~   The value stack grows downwards in memory, so this is the physical
@@ -320,10 +320,10 @@
   ~   with.
   ~ * "r0" is the logical bottom of the control stack
   ~   The control stack also grows downwards, so this is its pysical top
-  ~   as well. We allocate this dedicated space within the heap right here,
+  ~   as well. We allocate this dedicated space within the log right here,
   ~   in this routine, through our choice of where to put things.
-  ~ * "here" is the physical start of the unallocated space in the heap
-  ~   We allocate heap space from bottom to top, by incrementing this
+  ~ * "here" is the physical start of the unallocated space in the log
+  ~   We allocate log space from bottom to top, by incrementing this
   ~   value. So, it would also be accurate to say that it points immediately
   ~   after the physical top of the allocated space. At any rate, the
   ~   address it points to is the first one that hasn't been used yet.
@@ -350,8 +350,8 @@
   ~
   ~ A little more detail about why we offset everything by
   ~ control_stack_size: We're carving out some space at the bottom of the
-  ~ heap - which grows low-to-high - to be the control stack - which grows
-  ~ high-to-low. So the control stack is allocated out of the heap as a
+  ~ log - which grows low-to-high - to be the control stack - which grows
+  ~ high-to-low. So the control stack is allocated out of the log as a
   ~ fixed-size, one-time thing, and then the variables come immediately
   ~ after that. We do need to use 32-bit displacement indexing to access
   ~ them this way, but that's no big deal.
@@ -364,9 +364,9 @@
   ~ headers that point at them, but now we're almost ready to switch to
   ~ proper threaded-execution, so we finish that setup first...
 
-  ~ Push the value of "heap" onto the value stack so that it can be the
+  ~ Push the value of "log" onto the value stack so that it can be the
   ~ breadcrumb the threaded code needs to find... the backing store of
-  ~ "heap". Yes, self-reference can be weird like that sometimes. There's
+  ~ "log". Yes, self-reference can be weird like that sometimes. There's
   ~ nothing stopping "quit" from reading rdi, it just violates the
   ~ abstraction...
   :rdi push-reg64
@@ -417,7 +417,7 @@
   ~ ASLR, or embedding into other processes that impose their own addressing
   ~ constraints, or even coexisting with multiple versions of ourselves.
   ~ That choice does mean we have the hard version of this bootstrapping
-  ~ problem, and copying ourselves to the heap is how we solve it.
+  ~ problem, and copying ourselves to the log is how we solve it.
   ~
   ~ We do have the log address right now, though that won't last. In case
   ~ it's unclear why not: keeping it on the stack would require all future