diff options
| -rw-r--r-- | evoke.e | 542 | ||||
| -rw-r--r-- | execution.e | 539 | ||||
| -rw-r--r-- | interpret.e | 236 |
3 files changed, 742 insertions, 575 deletions
diff --git a/evoke.e b/evoke.e index add1db9..529a9a4 100644 --- a/evoke.e +++ b/evoke.e @@ -1,544 +1,4 @@ -~ cat labels.e elf.e evoke.e | ./quine > evoke && chmod 755 evoke && ./evoke - -~ ~~~~~~~~~~~~~~~~~~~~~ -~ ~~ Execution model ~~ -~ ~~~~~~~~~~~~~~~~~~~~~ -~ -~ We use Forth-style dual stacks, one for values and one for control. We -~ use rsp for values, just like C does. We use rbp for the control stack, -~ which is a special Forth-y stack: These are pointers into the bodies of -~ Forth words, not return addresses. -~ -~ The choice of rsp and rbp for the stack pointers imitates Jonesforth; -~ I'm hopeful that it gives us convenient addressing modes, and will report -~ back about that when I feel that I understand the implications. -~ -~ In Forth, everything is a "word", including mutable variables. -~ Conceptually, a word is a unit of execution, which may be implemented -~ either in machine code or as an array of pointer to other words. -~ -~ This polymorphism is implemented by having each word's contents begin -~ with a "codeword", which is a pointer to machine code that "interprets" -~ the rest of the contents. In the case of words implemented in machine -~ code, the codeword points directly to that code, which is normally right -~ next to it. -~ -~ Variables, to Forth, are simply one more thing that can be executed; the -~ effect of executing a variable is to push its address onto the value -~ stack. -~ -~ We adopt this model of words, codewords, and variables-as-words. It's -~ really nice how it doesn't force anything else on us, not even a heap, -~ though we do end up using a heap. -~ -~ We specifically implement a version of calling and returning that Forth -~ calls indirect threaded code: The control stack is a stack of pointers -~ into the middle of interpreted words. The interpreter snippet, called -~ docol, implements calling. Each word is responsible for making sure -~ returning works properly. Interpreted words accomplish this by ending with -~ the word "exit", while machine-code words accomplish it by ending with a -~ verbatim snippet called "next". -~ -~ Conceptually, "next" returns, but more specifically it accomplishes this -~ by doing the caller's next dispatch for it; thus control never actually -~ goes back to the caller's interpreter after initial setup. For performance -~ reasons, "next" is always inlined, so we define it as a macro. -~ -~ The docol routine is just ordinary code, not a macro. It's defined later -~ in this file, as a label. -~ -~ Notionally, we could consider not having a dictionary, and not giving -~ our words names. However, it feels silly to stop when we're so close to -~ being a full Forth, and using names for things solves a bootstrapping -~ problem related to heap management - see the write-up of _start about how -~ the heap is created, below. So, we add an additional header before the -~ codeword for this purpose. -~ -~ The Forth dictionary is usually a linked list of every word that has -~ ever been defined, with the newest at the head; the names of words are -~ stored in string fields, often right next to the link pointer. We adopt -~ this model, with the field sizes and order shown in the quick reference -~ below. We break with Forth tradition in one way: Rather than having a -~ length field, we use a null-terminated string. Thus, there's no length -~ limit on names. This necessitates breaking out the flags (to be explained -~ later) into their own byte, rather than taking bits from the length field -~ for them. -~ -~ There's an important performance consideration: Executable words -~ reference each other by pointers to their respective codewords. However, -~ dictionary entries reference each other by pointers to their respective -~ link fields. Traversing from the link field to the codeword is easy, -~ though it's a non-constant-time operation: Just walk the string. In order -~ to make Forth words easy to "decompile", it would be nice to also have a -~ way to traverse backwards. We solve this by making the name field be -~ null-terminated at both ends. Fun, yeah? -~ -~ -~ -~ -~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -~ Quick Reference -~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -~ -~ The layout of an interpreted word: -~ -~ (overall start) -~ 0x00 - 0x07 Link (to next-oldest word) -~ 0x08 - 0x08 HM00000I Flags -~ H - hidden -~ M - metadata -~ I - immediate -~ all other bits reserved -~ (name start) -~ 0x09 - 0x09 Null byte (terminates name) -~ 0x0a - name-end - 1 Name, as UTF-8 -~ name-end - name-end Null byte (terminates name) -~ (padding start) -~ name-end + 1 - codeword-start - 1 Zero-pad to 8-byte boundary -~ (it's possible this will be zero bytes long) -~ (codeword start) -~ ... + 0x00 - ... + 0x08 Codeword (ie. address of docol) -~ (8-byte chunks) Addresses of other words -~ - ... (end) Address of "exit" word -~ -~ The layout of a machine-code word is different only from the codeword on: -~ -~ ... + 0x00 - ... + 0x08 Addresss of next byte -~ ... + 0x08 - ???? Arbitrary machine code -~ - ... (end) Inlined implementation of next -~ -~ Also, words always start at 8-byte boundaries. -~ -~ -~ REGISTER usage conventions: -~ -~ * rsi is the "instruction pointer" for the "interpreter". -~ That is, it points to some word-pointer inside an array of -~ word-pointers inside the content of the word they're part of. It always -~ points to the next word that should be executed, whose execution hasn't -~ begun yet. -~ -~ * rbp points to the top of the control stack -~ These are former values of rsi, to eventually be returned to, from -~ successively older callers as you look further up the stack. The stack -~ grows downwards in memory. Since values are kept separately, the only -~ thing on the control stack is return addresses, one per layer of call. -~ -~ * rsp points to the top of the value stack -~ The value stack has no specific format, but it grows downwards in -~ memory. In particular there's no concept of stack frames, because items -~ on the stack don't belong to any particular word; the value stack in -~ Forth is in part a mechanism for passing values between words. -~ -~ Additionally, immediately after beginning execution of a word: -~ -~ * rax points to the address of the codeword being executed -~ The value of rax is purely for the callee's benefit, and does not need -~ to be preserved. -~ -~ Other registers are purely discretionary, and are not preserved across -~ calls. -~ -~ -~ FLAG usage: -~ -~ * DF should be 0 -~ We use lodsq extensively and that makes it increment rsi after using it. -~ -~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - - -~ Macro next -~ ~~~~~~~~~~ -~ -~ Include this inline at the end of a word implemented in machine-code. -~ Conceptually, it returns. What it actually does is do the next thing the -~ caller would do, which is call the next word from the caller's array of -~ word pointers. -~ -~ This is a widespread technique in Forth implementation, referred to as -~ indirect threaded code. It's "threaded" in the sense that each word takes -~ responsibility for finishing up by following the notional thread through the -~ metaphorical labyrinth to figure out the next word that its caller wants to -~ run after it. In other words, control never directly returns to the parent, -~ it proceeds directly to the sibling. -~ -~ Registers in: -~ -~ * rsi points to the address of the word to execute -~ -~ Registers out: -~ -~ * rax points to the codeword in the contents of the word that was executed -~ * rsi points to the next word-address after this one -~ -~ Flags -~ * DF = 0 is required -~ -~ (base address -- new base address) -: next - ~ Copy the next word's address from *rsi into rax. Increment rsi (as per the - ~ DF flag). - lods64 - - ~ Load the codeword from the word's contents, and jump to the interpreter it - ~ points to. - :rax jmp-abs-indirect-reg64 ; - - -~ Macro beforenext -~ ~~~~~~~~~~~~~~~~ -~ -~ Sometimes we want to transfer control from a word implemented in -~ machine-code to another word, without coming back after, as if we were -~ simply jumping to it. This is an innovation of ours; Jonesforth doesn't do -~ it. It is similar to the tail-call optimization that many Lisp dialects -~ have. -~ -~ This implementation will work regardless of how the receiving word is -~ implemented. It impersonates the "next" snippet, setting up rax to point -~ to the codeword then jumping to the interpreter. Since it doesn't change -~ the control stack or rsi, when the receiving word eventually invokes -~ "next"; it will pick up in the same place as if this sending word had done -~ it. -~ -~ Thus, notionally we are doing just this one transfer of control before -~ eventually getting around to inlining "next". Hence the name. -~ -~ (target address, base address -- new base address) -: beforenext - ~ Do a permanent transfer of control by setting rax and invoking the - ~ codeword. Of course, we could jump to docol ourselves but this will work - ~ regardless of what the receiving codeword is. - :rax mov-reg64-imm64 - :rax jmp-abs-indirect-reg64 ; - - -~ Macros pushcontrol -~ popcontrol -~ ~~~~~~~~~~~~~~~~~~ -~ -~ Include these inline to push an address onto the control stack, or pop -~ one off of it. You will recall the control stack is kept in rbp. The -~ parameter is given in a user-specified register. -~ -~ Jonesforth's analogous macros are called PUSHRSP and POPRSP but I think -~ that's super confusing, since rsp is also the name of a register, but a -~ different one. I guess it was less confusing in 32-bit, since esp doesn't -~ start with an "r". Anyway, this has to be named something that -~ distinguishes it from Intel's PUSH and POP opcodes, so... -~ -~ "Load effective address" is just a cute way to do arithmetic on a -~ register, here. To push or pop we decrement or increment rbp by 8. To -~ actually interact with the space in the stack, we indirect through rbp. -~ -~ Registers in and out: -~ -~ * rbp points to the top of the control stack. -~ -~ (source register, base address -- new base address) -: pushcontrol - swap :rbp -8 :rbp lea-reg64-disp8-reg64 - swap :rbp 0 mov-disp8-reg64-reg64 ; - -~ (target register, base address -- new base address) -: popcontrol - :rbp 0 3roll mov-reg64-disp8-reg64 - :rbp 8 :rbp lea-reg64-disp8-reg64 ; - -~ Constants -~ ~~~~~~~~~ -~ -~ These are used in cold-start, just below. - -: heap-requested-address 0x0000001000000000 ; ~ (very arbitrary) -: heap-size 0x0000000001000000 ; ~ 16 MiB -: control-stack-size 0x10000 ; ~ 64 KiB - - -~ Routine cold-start -~ ~~~~~~~~~~~~~~~~~~ -~ -~ This is the entry point of the whole program, the very first code we -~ actually execute. Linkers traditionally call this _start, but let's live a -~ little. Anyway, the ELF header points to it and exec() jumps to it; from the -~ kernel's perspective, there's no such thing as names. Also, though this code -~ could be anywhere in the code part of the output, in order to make the -~ hexdump pretty we put it at the start. -~ -~ The kernel gives us most registers zeroed, and rsp pointing to the -~ command-line stuff (argc, argv, envp), which is at an ASLR'd address with -~ some stack space allocated for us, despite the fact we didn't request any. -~ It also gives us all the flags clear except IF, but we don't rely on that. -~ Lastly, of course, it loads our code segment and sets the instruction -~ pointer where we asked; we don't need to check what those addresses are, -~ because they're not randomized. -~ -~ This routine is really only responsible for one-time initialization. -~ -~ Registers in: -~ -~ * rsp points to the logical top of the value stack -~ The kernel sets this up for us, and we need to save it somewhere so -~ Forth can use it. -~ -~ Registers out: -~ -~ * rsi points within "quit" -~ Quit is the word that's Forth's closest equivalent to main(). -~ * rsp points to the top of the value stack -~ -~ Notably, rbp is still uninitialialized after _start. -~ -~ Stack in: -~ -~ * argc, argv, envp in the usual Unix way -~ We ignore them, though. -~ -~ Stack out: -~ -~ * The value of "heap", as a pointer -~ The meaning of this will be explained below. -~ -~ Registers within: -~ -~ * rdi points to the base the heap was allocated at, once it exists -~ This is the same value that "heap" will hold, once we reach a point -~ where we have variables. Of course, variables are stored on the heap, -~ hence this temporary measure. -~ -~ We also take this opportunity to define soeme memory layout parameters -~ that this routine will be responsible for doing something with: -~ -~ (output memory start, current output point -~ -- output memory start, current output point) -: cold-start - current-offset L' cold-start set-label - cld ~ clear the DF flag - - ~ Prepare the heap. - ~ - ~ We could ask for a data segment in the program header, but where's the - ~ fun in that? Instead, we call mmap(). - ~ - ~ If we wanted the kernel to do ASLR for us, passing address zero would - ~ cause it to pick somewhere at random, but instead we choose our own - ~ location. It's still not guaranteed to be where we ask for, so we still - ~ do the work to record where it wound up. We could pass the "fixed" flag - ~ and the kernel would trust us, but this gives us more options for - ~ interoperating with other runtimes. - ~ - 9 :rax mov-reg64-imm64 ~ mmap() - heap-requested-address :rdi mov-reg64-imm64 ~ address (very arbitrary) - heap-size :rsi mov-reg64-imm64 ~ size (one meg) - 0x07 :rdx mov-reg64-imm64 ~ protection (read+write+exec) - 0x22 :r10 mov-extrareg64-imm64 ~ flags (private+anonymous) - 0 :r8 mov-extrareg64-imm64 ~ file descriptor (ignored) - 0 :r9 mov-extrareg64-imm64 ~ offset (ignored) - syscall - - ~ The return value of the system call is in rax, we'll use it in a sec. - ~ We need to save this somewhere in case we ever want to munmap() it; - ~ there's no widely-used name for it so we have to make one up. S0 and r0 - ~ are widely-used names for the physical tops (logical bottoms) of the - ~ value and control stacks, respectively, and we will eventually set those - ~ up as well, so we should keep those names in mind. The control stack - ~ lives within the heap, while the value stack is its own segment. This - ~ value, though, is the physical bottom of the segment, meaning that it - ~ stays the same even as we allocate and deallocate things within it. This - ~ is unlike the two stack pointers, so we give it a name that doesn't - ~ suggest similarity: "heap". - ~ - ~ Once Forth is fully set up, its internal variables will be accessed - ~ through variable-words like any other Forth data, including "heap". To - ~ get to that point, though, we need to be able to hold onto variable data - ~ between now and then. In fact, if we don't have at least one of "heap" - ~ and "here" (its counterpart which points to the logical top end), all - ~ our efforts to hold onto anything seem a bit doomed. - ~ - ~ So, we temporarily dedicate rdi to "heap" - only within this routine - - ~ and store everything else in ways that let us find things by reference - ~ to it. We choose rdi because it works with the indexing modes we care - ~ about, and its name suggests its function. - ~ - ~ The strategy Jonesforth uses is not applicable to us; Jonesforth - ~ takes advantage of the linker to let its code segment refer to specific, - ~ pre-allocated objects in the data segment. We are our own linker, and we - ~ don't care to have a data segment. Hence, this approach. - ~ - ~ Keying things off "heap" is the fundamental decision, but to make sure - ~ our variables are accessible both during early bootstrapping, and later, - ~ we also have to be thoughtful about data structures. More on that in a - ~ moment. - ~ - :rax :rdi mov-reg64-reg64 - - ~ We also initialize rbp. We could hold off and let "quit" do this, but - ~ doing it now is the easiest way to initialize the r0 variable, since - ~ there's no instruction that moves a 64-bit immediate to memory. - ~ - ~ This is the moment at which we decide where the control stack starts! - ~ Fun, right? "Allocation" is just a fancy word for picking where we want - ~ something, then being consistent about it - like placing furniture in - ~ your home. See below for a little more thought about why here in - ~ particular. - ~ - :rdi control-stack-size :rbp lea-reg64-disp32-reg64 - - ~ Now we save some stuff onto the heap. These are the locations that - ~ will eventually be the backing stores of the Forth variables, but we - ~ don't create the word headers yet, since there's no requirement that - ~ they be next to the backing stores. We'll do that later, once we have - ~ word-writing infrastructure in place. For now, we just use their offsets - ~ relative to the physical bottom of the heap, which are fixed. - ~ - ~ These will be the permanent homes of these values, though we have - ~ copies of them elsewhere while we're still in this routine. - ~ - :rdi control-stack-size 0x00 + :rdi mov-reg64-disp32-reg64 ~ heap - :rsp control-stack-size 0x08 + :rdi mov-reg64-disp32-reg64 ~ s0 - :rbp control-stack-size 0x10 + :rdi mov-reg64-disp32-reg64 ~ r0 - L@' final-word-name :rax mov-reg64-imm64 - :rax control-stack-size 0x18 + :rdi mov-reg64-disp32-reg64 ~ latest - :rdi control-stack-size 0x28 + :rax lea-reg64-disp32-reg64 - :rax control-stack-size 0x20 + :rdi mov-reg64-disp32-reg64 ~ here - ~ - ~ * "heap" is the physical bottom of the heap - ~ The heap grows upwards in memory, so this is also the logical - ~ bottom. This comes from the address mmap() just returned to us. - ~ * "s0" is the logical bottom of the value stack - ~ The value stack grows downwards in memory, so this is the physical - ~ top of it. This comes from the stack pointer the kernel initialized us - ~ with. - ~ * "r0" is the logical bottom of the control stack - ~ The control stack also grows downwards, so this is its pysical top - ~ as well. We allocate this dedicated space within the heap right here, - ~ in this routine, through our choice of where to put things. - ~ * "here" is the physical start of the unallocated space in the heap - ~ We allocate heap space from bottom to top, by incrementing this - ~ value. So, it would also be accurate to say that it points immediately - ~ after the physical top of the allocated space. At any rate, the - ~ address it points to is the first one that hasn't been used yet. - ~ * "latest" is the address of the most-recently-defined word's header - ~ Defining new words changes this value. - ~ - ~ s0 and r0 are mostly used when we want to initialize or reinitialize - ~ their respective stacks - that is, discard all their contents at once. - ~ - ~ The value of r0 is the same address these variables start at, so - ~ you'll want to do a close read of the implementation of pushcontrol - ~ and convince yourself that it only ever writes things just below the rbp - ~ address it receives, never right on top of it. - ~ - ~ As an aside, by the way, please notice that strictly speaking, r0 - ~ could be a constant... but it isn't known until runtime, so we might as - ~ well make it a variable. That will play nicely with any astonishing - ~ memory shenanigans someone might wish to do in the future. - ~ - ~ Notice also that "here" points immediately after itself. This is just - ~ a convenience, making it the last one like that so that the concern is - ~ dealt with in a single place and is easy to keep up-to-date with code - ~ changes. - ~ - ~ A little more detail about why we offset everything by - ~ control_stack_size: We're carving out some space at the bottom of the - ~ heap - which grows low-to-high - to be the control stack - which grows - ~ high-to-low. So the control stack is allocated out of the heap as a - ~ fixed-size, one-time thing, and then the variables come immediately - ~ after that. We do need to use 32-bit displacement indexing to access - ~ them this way, but that's no big deal. - ~ - ~ This is perhaps questionable, they should maybe be separate segments - ~ created with separate calls to mmap(), but for now we're not worried - ~ about overflow so we use the same allocation for both. - ~ - ~ We'll come back to these variables a bit later and generate the word - ~ headers that point at them, but now we're almost ready to switch to - ~ proper threaded-execution, so we finish that setup first... - - ~ Push the value of "heap" onto the value stack so that it can be the - ~ breadcrumb the threaded code needs to find... the backing store of - ~ "heap". Yes, self-reference can be weird like that sometimes. There's - ~ nothing stopping "quit" from reading rdi, it just violates the - ~ abstraction... - :rdi push-reg64 - - ~ We are about to set up rsi, we did rbp already, and rsp came to us - ~ already set up. That's all that "next" needs, so take it away! - L@' warm-start :rsi mov-reg64-imm64 - next ; - -~ Routine warm-start -~ ~~~~~~~~~~~~~~~~~~ -~ -~ This routine runs as Forth code. That is, indirect threaded execution has -~ been initialized, and the structure of this routine is an array of codeword -~ pointers. It was directly jumped to from cold-start. There's nowhere to -~ return to, so it needs to never return. -: warm-start - ~ While it's not actually a requirement that codeword pointers be - ~ word-aligned, it's highly likely that it helps performance. (Whether it - ~ does is up to Intel's microcode.) - 8 packalign - current-offset L' warm-start set-label - - ~ Before handing off to us, cold-start pushed a single value onto the - ~ stack, a pointer to the beginning of the heap. Now, we load our entire - ~ Forth implementation onto that heap, beginning with the minimal set of - ~ words needed to define more words. We do this because we need variables as - ~ infrastructure so we can eventually have dynamic definitions. - ~ - ~ There's something non-obvious here: words implemented statically as - ~ part of the executable image can't contain things that vary at runtime. - ~ That means that even if these words tried to implement some sort of - ~ dynamic lookup, they would have no way to find the root of whatever - ~ dynamic data structure they use. Dynamism needs to be bootstrapped. - ~ - ~ In a more traditional C-style program, static code could look up - ~ variables based on fixed addresses that are the same on every run. - ~ Failing that, we could dedicate a register to it, though that's a - ~ considerable expense. We chose not to do either of those things, because - ~ we want the versatility that comes with not being picky about our - ~ address space: It allows us to contemplate future improvements such as - ~ ASLR, or embedding into other processes that impose their own addressing - ~ constraints, or even coexisting with multiple versions of ourselves. - ~ That choice does mean we have the hard version of this bootstrapping - ~ problem, and copying ourselves to the heap is how we solve it. - ~ - ~ We do have the heap address right now, though that won't last. In case - ~ it's unclear why not: keeping it on the stack would require all future - ~ references to walk the stack, and somehow know when they've reached the - ~ bottom. The stack is a good place to keep things with clearly delimited - ~ lifetimes and visibility, but when we want something to live for our - ~ entire program and be easy to find from any code within it, we need to - ~ do something else. Anyway, since we have the address, we can use it for - ~ the next little bit of setup. - ~ - ~ The first few words we define are our variables, which hardcode the - ~ addresses they will return - but since we're doing this at runtime, - ~ "hardcoding" can reflect where our heap is. This is the fundamental - ~ trick that makes the heap usable. - ~ - ~ One more thing to notice: We already allocated the backing stores of - ~ these variables, and populated their initial values, in _start. The - ~ words we're defining return those same addresses for the same backing - ~ stores. So, we have continuity: Stuff defined in terms of the - ~ variable-words we're defining now will interoperate with the stuff that - ~ we define in the "early" way, which includes those very words. Both the - ~ early code and the later code are dealing with the same data structures, - ~ they're just using a different technique to find them. - ~ - ~ This is the only hardcoding we need to do; by building on top of it, - ~ we will soon reach a point where the rest of the system can be defined - ~ within itself. - ~ TODO These need to, like, exist first. Also they need to be referenced - ~ as labels. - ~ dq early_heap, litstring, "heap", early_variable - ~ dq early_s0, litstring, "s0", early_variable - ~ dq early_r0, litstring, "r0", early_variable - ~ dq early_latest, litstring, "latest", early_variable - ~ dq early_here, litstring, "here", early_variable - ; +~ cat labels.e elf.e execution.e evoke.e | ./quine > evoke && chmod 755 evoke && ./evoke ~ (output memory start, current output point ~ -- output memory start, current output point) diff --git a/execution.e b/execution.e new file mode 100644 index 0000000..f6e5978 --- /dev/null +++ b/execution.e @@ -0,0 +1,539 @@ +~ ~~~~~~~~~~~~~~~~~~~~~ +~ ~~ Execution model ~~ +~ ~~~~~~~~~~~~~~~~~~~~~ +~ +~ We use Forth-style dual stacks, one for values and one for control. We +~ use rsp for values, just like C does. We use rbp for the control stack, +~ which is a special Forth-y stack: These are pointers into the bodies of +~ Forth words, not return addresses. +~ +~ The choice of rsp and rbp for the stack pointers imitates Jonesforth; +~ I'm hopeful that it gives us convenient addressing modes, and will report +~ back about that when I feel that I understand the implications. +~ +~ In Forth, everything is a "word", including mutable variables. +~ Conceptually, a word is a unit of execution, which may be implemented +~ either in machine code or as an array of pointer to other words. +~ +~ This polymorphism is implemented by having each word's contents begin +~ with a "codeword", which is a pointer to machine code that "interprets" +~ the rest of the contents. In the case of words implemented in machine +~ code, the codeword points directly to that code, which is normally right +~ next to it. +~ +~ Variables, to Forth, are simply one more thing that can be executed; the +~ effect of executing a variable is to push its address onto the value +~ stack. +~ +~ We adopt this model of words, codewords, and variables-as-words. It's +~ really nice how it doesn't force anything else on us, not even a heap, +~ though we do end up using a heap. +~ +~ We specifically implement a version of calling and returning that Forth +~ calls indirect threaded code: The control stack is a stack of pointers +~ into the middle of interpreted words. The interpreter snippet, called +~ docol, implements calling. Each word is responsible for making sure +~ returning works properly. Interpreted words accomplish this by ending with +~ the word "exit", while machine-code words accomplish it by ending with a +~ verbatim snippet called "next". +~ +~ Conceptually, "next" returns, but more specifically it accomplishes this +~ by doing the caller's next dispatch for it; thus control never actually +~ goes back to the caller's interpreter after initial setup. For performance +~ reasons, "next" is always inlined, so we define it as a macro. +~ +~ The docol routine is just ordinary code, not a macro. It's defined later +~ in this file, as a label. +~ +~ Notionally, we could consider not having a dictionary, and not giving +~ our words names. However, it feels silly to stop when we're so close to +~ being a full Forth, and using names for things solves a bootstrapping +~ problem related to heap management - see the write-up of _start about how +~ the heap is created, below. So, we add an additional header before the +~ codeword for this purpose. +~ +~ The Forth dictionary is usually a linked list of every word that has +~ ever been defined, with the newest at the head; the names of words are +~ stored in string fields, often right next to the link pointer. We adopt +~ this model, with the field sizes and order shown in the quick reference +~ below. We break with Forth tradition in one way: Rather than having a +~ length field, we use a null-terminated string. Thus, there's no length +~ limit on names. This necessitates breaking out the flags (to be explained +~ later) into their own byte, rather than taking bits from the length field +~ for them. +~ +~ There's an important performance consideration: Executable words +~ reference each other by pointers to their respective codewords. However, +~ dictionary entries reference each other by pointers to their respective +~ link fields. Traversing from the link field to the codeword is easy, +~ though it's a non-constant-time operation: Just walk the string. In order +~ to make Forth words easy to "decompile", it would be nice to also have a +~ way to traverse backwards. We solve this by making the name field be +~ null-terminated at both ends. Fun, yeah? +~ +~ +~ +~ +~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~ Quick Reference +~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~ +~ The layout of an interpreted word: +~ +~ (overall start) +~ 0x00 - 0x07 Link (to next-oldest word) +~ 0x08 - 0x08 HM00000I Flags +~ H - hidden +~ M - metadata +~ I - immediate +~ all other bits reserved +~ (name start) +~ 0x09 - 0x09 Null byte (terminates name) +~ 0x0a - name-end - 1 Name, as UTF-8 +~ name-end - name-end Null byte (terminates name) +~ (padding start) +~ name-end + 1 - codeword-start - 1 Zero-pad to 8-byte boundary +~ (it's possible this will be zero bytes long) +~ (codeword start) +~ ... + 0x00 - ... + 0x08 Codeword (ie. address of docol) +~ (8-byte chunks) Addresses of other words +~ - ... (end) Address of "exit" word +~ +~ The layout of a machine-code word is different only from the codeword on: +~ +~ ... + 0x00 - ... + 0x08 Addresss of next byte +~ ... + 0x08 - ???? Arbitrary machine code +~ - ... (end) Inlined implementation of next +~ +~ Also, words always start at 8-byte boundaries. +~ +~ +~ REGISTER usage conventions: +~ +~ * rsi is the "instruction pointer" for the "interpreter". +~ That is, it points to some word-pointer inside an array of +~ word-pointers inside the content of the word they're part of. It always +~ points to the next word that should be executed, whose execution hasn't +~ begun yet. +~ +~ * rbp points to the top of the control stack +~ These are former values of rsi, to eventually be returned to, from +~ successively older callers as you look further up the stack. The stack +~ grows downwards in memory. Since values are kept separately, the only +~ thing on the control stack is return addresses, one per layer of call. +~ +~ * rsp points to the top of the value stack +~ The value stack has no specific format, but it grows downwards in +~ memory. In particular there's no concept of stack frames, because items +~ on the stack don't belong to any particular word; the value stack in +~ Forth is in part a mechanism for passing values between words. +~ +~ Additionally, immediately after beginning execution of a word: +~ +~ * rax points to the address of the codeword being executed +~ The value of rax is purely for the callee's benefit, and does not need +~ to be preserved. +~ +~ Other registers are purely discretionary, and are not preserved across +~ calls. +~ +~ +~ FLAG usage: +~ +~ * DF should be 0 +~ We use lodsq extensively and that makes it increment rsi after using it. +~ +~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + +~ Macro next +~ ~~~~~~~~~~ +~ +~ Include this inline at the end of a word implemented in machine-code. +~ Conceptually, it returns. What it actually does is do the next thing the +~ caller would do, which is call the next word from the caller's array of +~ word pointers. +~ +~ This is a widespread technique in Forth implementation, referred to as +~ indirect threaded code. It's "threaded" in the sense that each word takes +~ responsibility for finishing up by following the notional thread through the +~ metaphorical labyrinth to figure out the next word that its caller wants to +~ run after it. In other words, control never directly returns to the parent, +~ it proceeds directly to the sibling. +~ +~ Registers in: +~ +~ * rsi points to the address of the word to execute +~ +~ Registers out: +~ +~ * rax points to the codeword in the contents of the word that was executed +~ * rsi points to the next word-address after this one +~ +~ Flags +~ * DF = 0 is required +~ +~ (base address -- new base address) +: next + ~ Copy the next word's address from *rsi into rax. Increment rsi (as per the + ~ DF flag). + lods64 + + ~ Load the codeword from the word's contents, and jump to the interpreter it + ~ points to. + :rax jmp-abs-indirect-reg64 ; + + +~ Macro beforenext +~ ~~~~~~~~~~~~~~~~ +~ +~ Sometimes we want to transfer control from a word implemented in +~ machine-code to another word, without coming back after, as if we were +~ simply jumping to it. This is an innovation of ours; Jonesforth doesn't do +~ it. It is similar to the tail-call optimization that many Lisp dialects +~ have. +~ +~ This implementation will work regardless of how the receiving word is +~ implemented. It impersonates the "next" snippet, setting up rax to point +~ to the codeword then jumping to the interpreter. Since it doesn't change +~ the control stack or rsi, when the receiving word eventually invokes +~ "next"; it will pick up in the same place as if this sending word had done +~ it. +~ +~ Thus, notionally we are doing just this one transfer of control before +~ eventually getting around to inlining "next". Hence the name. +~ +~ (target address, base address -- new base address) +: beforenext + ~ Do a permanent transfer of control by setting rax and invoking the + ~ codeword. Of course, we could jump to docol ourselves but this will work + ~ regardless of what the receiving codeword is. + :rax mov-reg64-imm64 + :rax jmp-abs-indirect-reg64 ; + + +~ Macros pushcontrol +~ popcontrol +~ ~~~~~~~~~~~~~~~~~~ +~ +~ Include these inline to push an address onto the control stack, or pop +~ one off of it. You will recall the control stack is kept in rbp. The +~ parameter is given in a user-specified register. +~ +~ Jonesforth's analogous macros are called PUSHRSP and POPRSP but I think +~ that's super confusing, since rsp is also the name of a register, but a +~ different one. I guess it was less confusing in 32-bit, since esp doesn't +~ start with an "r". Anyway, this has to be named something that +~ distinguishes it from Intel's PUSH and POP opcodes, so... +~ +~ "Load effective address" is just a cute way to do arithmetic on a +~ register, here. To push or pop we decrement or increment rbp by 8. To +~ actually interact with the space in the stack, we indirect through rbp. +~ +~ Registers in and out: +~ +~ * rbp points to the top of the control stack. +~ +~ (source register, base address -- new base address) +: pushcontrol + swap :rbp -8 :rbp lea-reg64-disp8-reg64 + swap :rbp 0 mov-disp8-reg64-reg64 ; + +~ (target register, base address -- new base address) +: popcontrol + :rbp 0 3roll mov-reg64-disp8-reg64 + :rbp 8 :rbp lea-reg64-disp8-reg64 ; + +~ Constants +~ ~~~~~~~~~ +~ +~ These are used in cold-start, just below. + +: heap-requested-address 0x0000001000000000 ; ~ (very arbitrary) +: heap-size 0x0000000001000000 ; ~ 16 MiB +: control-stack-size 0x10000 ; ~ 64 KiB + + +~ Routine cold-start +~ ~~~~~~~~~~~~~~~~~~ +~ +~ This is the entry point of the whole program, the very first code we +~ actually execute. Linkers traditionally call this _start, but let's live a +~ little. Anyway, the ELF header points to it and exec() jumps to it; from the +~ kernel's perspective, there's no such thing as names. Also, though this code +~ could be anywhere in the code part of the output, in order to make the +~ hexdump pretty we put it at the start. +~ +~ The kernel gives us most registers zeroed, and rsp pointing to the +~ command-line stuff (argc, argv, envp), which is at an ASLR'd address with +~ some stack space allocated for us, despite the fact we didn't request any. +~ It also gives us all the flags clear except IF, but we don't rely on that. +~ Lastly, of course, it loads our code segment and sets the instruction +~ pointer where we asked; we don't need to check what those addresses are, +~ because they're not randomized. +~ +~ This routine is really only responsible for one-time initialization. +~ +~ Registers in: +~ +~ * rsp points to the logical top of the value stack +~ The kernel sets this up for us, and we need to save it somewhere so +~ Forth can use it. +~ +~ Registers out: +~ +~ * rsi points within "quit" +~ Quit is the word that's Forth's closest equivalent to main(). +~ * rsp points to the top of the value stack +~ +~ Notably, rbp is still uninitialialized after _start. +~ +~ Stack in: +~ +~ * argc, argv, envp in the usual Unix way +~ We ignore them, though. +~ +~ Stack out: +~ +~ * The value of "heap", as a pointer +~ The meaning of this will be explained below. +~ +~ Registers within: +~ +~ * rdi points to the base the heap was allocated at, once it exists +~ This is the same value that "heap" will hold, once we reach a point +~ where we have variables. Of course, variables are stored on the heap, +~ hence this temporary measure. +~ +~ We also take this opportunity to define soeme memory layout parameters +~ that this routine will be responsible for doing something with: +~ +~ (output memory start, current output point +~ -- output memory start, current output point) +: cold-start + current-offset L' cold-start set-label + cld ~ clear the DF flag + + ~ Prepare the heap. + ~ + ~ We could ask for a data segment in the program header, but where's the + ~ fun in that? Instead, we call mmap(). + ~ + ~ If we wanted the kernel to do ASLR for us, passing address zero would + ~ cause it to pick somewhere at random, but instead we choose our own + ~ location. It's still not guaranteed to be where we ask for, so we still + ~ do the work to record where it wound up. We could pass the "fixed" flag + ~ and the kernel would trust us, but this gives us more options for + ~ interoperating with other runtimes. + ~ + 9 :rax mov-reg64-imm64 ~ mmap() + heap-requested-address :rdi mov-reg64-imm64 ~ address (very arbitrary) + heap-size :rsi mov-reg64-imm64 ~ size (one meg) + 0x07 :rdx mov-reg64-imm64 ~ protection (read+write+exec) + 0x22 :r10 mov-extrareg64-imm64 ~ flags (private+anonymous) + 0 :r8 mov-extrareg64-imm64 ~ file descriptor (ignored) + 0 :r9 mov-extrareg64-imm64 ~ offset (ignored) + syscall + + ~ The return value of the system call is in rax, we'll use it in a sec. + ~ We need to save this somewhere in case we ever want to munmap() it; + ~ there's no widely-used name for it so we have to make one up. S0 and r0 + ~ are widely-used names for the physical tops (logical bottoms) of the + ~ value and control stacks, respectively, and we will eventually set those + ~ up as well, so we should keep those names in mind. The control stack + ~ lives within the heap, while the value stack is its own segment. This + ~ value, though, is the physical bottom of the segment, meaning that it + ~ stays the same even as we allocate and deallocate things within it. This + ~ is unlike the two stack pointers, so we give it a name that doesn't + ~ suggest similarity: "heap". + ~ + ~ Once Forth is fully set up, its internal variables will be accessed + ~ through variable-words like any other Forth data, including "heap". To + ~ get to that point, though, we need to be able to hold onto variable data + ~ between now and then. In fact, if we don't have at least one of "heap" + ~ and "here" (its counterpart which points to the logical top end), all + ~ our efforts to hold onto anything seem a bit doomed. + ~ + ~ So, we temporarily dedicate rdi to "heap" - only within this routine - + ~ and store everything else in ways that let us find things by reference + ~ to it. We choose rdi because it works with the indexing modes we care + ~ about, and its name suggests its function. + ~ + ~ The strategy Jonesforth uses is not applicable to us; Jonesforth + ~ takes advantage of the linker to let its code segment refer to specific, + ~ pre-allocated objects in the data segment. We are our own linker, and we + ~ don't care to have a data segment. Hence, this approach. + ~ + ~ Keying things off "heap" is the fundamental decision, but to make sure + ~ our variables are accessible both during early bootstrapping, and later, + ~ we also have to be thoughtful about data structures. More on that in a + ~ moment. + ~ + :rax :rdi mov-reg64-reg64 + + ~ We also initialize rbp. We could hold off and let "quit" do this, but + ~ doing it now is the easiest way to initialize the r0 variable, since + ~ there's no instruction that moves a 64-bit immediate to memory. + ~ + ~ This is the moment at which we decide where the control stack starts! + ~ Fun, right? "Allocation" is just a fancy word for picking where we want + ~ something, then being consistent about it - like placing furniture in + ~ your home. See below for a little more thought about why here in + ~ particular. + ~ + :rdi control-stack-size :rbp lea-reg64-disp32-reg64 + + ~ Now we save some stuff onto the heap. These are the locations that + ~ will eventually be the backing stores of the Forth variables, but we + ~ don't create the word headers yet, since there's no requirement that + ~ they be next to the backing stores. We'll do that later, once we have + ~ word-writing infrastructure in place. For now, we just use their offsets + ~ relative to the physical bottom of the heap, which are fixed. + ~ + ~ These will be the permanent homes of these values, though we have + ~ copies of them elsewhere while we're still in this routine. + ~ + :rdi control-stack-size 0x00 + :rdi mov-reg64-disp32-reg64 ~ heap + :rsp control-stack-size 0x08 + :rdi mov-reg64-disp32-reg64 ~ s0 + :rbp control-stack-size 0x10 + :rdi mov-reg64-disp32-reg64 ~ r0 + L@' final-word-name :rax mov-reg64-imm64 + :rax control-stack-size 0x18 + :rdi mov-reg64-disp32-reg64 ~ latest + :rdi control-stack-size 0x28 + :rax lea-reg64-disp32-reg64 + :rax control-stack-size 0x20 + :rdi mov-reg64-disp32-reg64 ~ here + ~ + ~ * "heap" is the physical bottom of the heap + ~ The heap grows upwards in memory, so this is also the logical + ~ bottom. This comes from the address mmap() just returned to us. + ~ * "s0" is the logical bottom of the value stack + ~ The value stack grows downwards in memory, so this is the physical + ~ top of it. This comes from the stack pointer the kernel initialized us + ~ with. + ~ * "r0" is the logical bottom of the control stack + ~ The control stack also grows downwards, so this is its pysical top + ~ as well. We allocate this dedicated space within the heap right here, + ~ in this routine, through our choice of where to put things. + ~ * "here" is the physical start of the unallocated space in the heap + ~ We allocate heap space from bottom to top, by incrementing this + ~ value. So, it would also be accurate to say that it points immediately + ~ after the physical top of the allocated space. At any rate, the + ~ address it points to is the first one that hasn't been used yet. + ~ * "latest" is the address of the most-recently-defined word's header + ~ Defining new words changes this value. + ~ + ~ s0 and r0 are mostly used when we want to initialize or reinitialize + ~ their respective stacks - that is, discard all their contents at once. + ~ + ~ The value of r0 is the same address these variables start at, so + ~ you'll want to do a close read of the implementation of pushcontrol + ~ and convince yourself that it only ever writes things just below the rbp + ~ address it receives, never right on top of it. + ~ + ~ As an aside, by the way, please notice that strictly speaking, r0 + ~ could be a constant... but it isn't known until runtime, so we might as + ~ well make it a variable. That will play nicely with any astonishing + ~ memory shenanigans someone might wish to do in the future. + ~ + ~ Notice also that "here" points immediately after itself. This is just + ~ a convenience, making it the last one like that so that the concern is + ~ dealt with in a single place and is easy to keep up-to-date with code + ~ changes. + ~ + ~ A little more detail about why we offset everything by + ~ control_stack_size: We're carving out some space at the bottom of the + ~ heap - which grows low-to-high - to be the control stack - which grows + ~ high-to-low. So the control stack is allocated out of the heap as a + ~ fixed-size, one-time thing, and then the variables come immediately + ~ after that. We do need to use 32-bit displacement indexing to access + ~ them this way, but that's no big deal. + ~ + ~ This is perhaps questionable, they should maybe be separate segments + ~ created with separate calls to mmap(), but for now we're not worried + ~ about overflow so we use the same allocation for both. + ~ + ~ We'll come back to these variables a bit later and generate the word + ~ headers that point at them, but now we're almost ready to switch to + ~ proper threaded-execution, so we finish that setup first... + + ~ Push the value of "heap" onto the value stack so that it can be the + ~ breadcrumb the threaded code needs to find... the backing store of + ~ "heap". Yes, self-reference can be weird like that sometimes. There's + ~ nothing stopping "quit" from reading rdi, it just violates the + ~ abstraction... + :rdi push-reg64 + + ~ We are about to set up rsi, we did rbp already, and rsp came to us + ~ already set up. That's all that "next" needs, so take it away! + L@' warm-start :rsi mov-reg64-imm64 + next ; + +~ Routine warm-start +~ ~~~~~~~~~~~~~~~~~~ +~ +~ This routine runs as Forth code. That is, indirect threaded execution has +~ been initialized, and the structure of this routine is an array of codeword +~ pointers. It was directly jumped to from cold-start. There's nowhere to +~ return to, so it needs to never return. +: warm-start + ~ While it's not actually a requirement that codeword pointers be + ~ word-aligned, it's highly likely that it helps performance. (Whether it + ~ does is up to Intel's microcode.) + 8 packalign + current-offset L' warm-start set-label + + ~ Before handing off to us, cold-start pushed a single value onto the + ~ stack, a pointer to the beginning of the heap. Now, we load our entire + ~ Forth implementation onto that heap, beginning with the minimal set of + ~ words needed to define more words. We do this because we need variables as + ~ infrastructure so we can eventually have dynamic definitions. + ~ + ~ There's something non-obvious here: words implemented statically as + ~ part of the executable image can't contain things that vary at runtime. + ~ That means that even if these words tried to implement some sort of + ~ dynamic lookup, they would have no way to find the root of whatever + ~ dynamic data structure they use. Dynamism needs to be bootstrapped. + ~ + ~ In a more traditional C-style program, static code could look up + ~ variables based on fixed addresses that are the same on every run. + ~ Failing that, we could dedicate a register to it, though that's a + ~ considerable expense. We chose not to do either of those things, because + ~ we want the versatility that comes with not being picky about our + ~ address space: It allows us to contemplate future improvements such as + ~ ASLR, or embedding into other processes that impose their own addressing + ~ constraints, or even coexisting with multiple versions of ourselves. + ~ That choice does mean we have the hard version of this bootstrapping + ~ problem, and copying ourselves to the heap is how we solve it. + ~ + ~ We do have the heap address right now, though that won't last. In case + ~ it's unclear why not: keeping it on the stack would require all future + ~ references to walk the stack, and somehow know when they've reached the + ~ bottom. The stack is a good place to keep things with clearly delimited + ~ lifetimes and visibility, but when we want something to live for our + ~ entire program and be easy to find from any code within it, we need to + ~ do something else. Anyway, since we have the address, we can use it for + ~ the next little bit of setup. + ~ + ~ The first few words we define are our variables, which hardcode the + ~ addresses they will return - but since we're doing this at runtime, + ~ "hardcoding" can reflect where our heap is. This is the fundamental + ~ trick that makes the heap usable. + ~ + ~ One more thing to notice: We already allocated the backing stores of + ~ these variables, and populated their initial values, in _start. The + ~ words we're defining return those same addresses for the same backing + ~ stores. So, we have continuity: Stuff defined in terms of the + ~ variable-words we're defining now will interoperate with the stuff that + ~ we define in the "early" way, which includes those very words. Both the + ~ early code and the later code are dealing with the same data structures, + ~ they're just using a different technique to find them. + ~ + ~ This is the only hardcoding we need to do; by building on top of it, + ~ we will soon reach a point where the rest of the system can be defined + ~ within itself. + ~ TODO These need to, like, exist first. Also they need to be referenced + ~ as labels. + ~ dq early_heap, litstring, "heap", early_variable + ~ dq early_s0, litstring, "s0", early_variable + ~ dq early_r0, litstring, "r0", early_variable + ~ dq early_latest, litstring, "latest", early_variable + ~ dq early_here, litstring, "here", early_variable + ; diff --git a/interpret.e b/interpret.e index 235c98b..4517b25 100644 --- a/interpret.e +++ b/interpret.e @@ -4,8 +4,8 @@ ~ ~ The code in this file defines the basic syntax and semantics of Forth as ~ a text-based language. It's written in terms of the underlying executor, -~ which is implemented and explained in evoke.e. The execution model gives us -~ the concept of "words"; the control and value stacks; and the ability to +~ which is implemented and explained in execution.e. The execution model gives +~ us the concept of "words"; the control and value stacks; and the ability to ~ call things. It has nothing to say about text, only about the binary form of ~ the language. ~ @@ -59,8 +59,8 @@ ~ all set up to start appending contents to the new word by calling ",". ~ ~ There's a handy diagram of the entry header format under "quick -~ reference", in the description of the exeuction model in evoke.e. Create is -~ responsible for everything up to the codeword, not including it. +~ reference", in the description of the exeuction model in exeuction.e. Create +~ is responsible for everything up to the codeword, not including it. ~ ~ When a word is created in interpret mode using s" to provide a string ~ literal, the temporary space that s" uses is in the same place as the @@ -127,10 +127,34 @@ ~ Finally, we write our updated value of "here" back into the variable. here ! ; -~ , 0000001000018080 -~ self-codeword 00000010000180d0 -~ variable 0000001000018128 -~ allocate 00000010000181c8 +~ (value to append to current word-in-progress --) +: , here @ swap pack64 here ! ; + +: self-codeword here @ 8 + , ; + +~ (address for new variable word to create, name string --) +: variable + create + self-codeword + here @ + swap :rax mov-reg64-imm64 + :rax push-reg64 + next + 8 packalign + here ! ; + +~ Allocates bytes on the heap by incrementing the global "here" pointer. The +~ "here" pointer is kept aligned to an 8-byte boundary, regardless of the size +~ requested. +~ +~ This does not create dictionary entries, it's just a raw memory interface. +~ It's suitable for allocating data or scratch space. +: allocate + here @ dup + ~ (size, here value, here value) + 3roll + 8 packalign here ! ; + +~ TODO ~ buffer-physical-start 0000001000018240 ~ buffer-physical-length 0000001000018270 ~ buffer-logical-start 00000010000182c0 @@ -157,31 +181,174 @@ ~ accumulate-string 0000001000018fc8 ~ word 00000010000194a0 ~ find 00000010000195f0 -~ is-alphanumeric 0000001000019628 -~ generalized-digit-value 0000001000019850 -~ decode-generalized-digit 0000001000019970 -~ read-base-unsigned 0000001000019a58 -~ read-integer-unsigned 0000001000019cb8 -~ read-integer 0000001000019eb0 + +~ (character -- 1 for true or 0 for false) +: is-alphanumeric + ~ We don't have a character-literal syntax; the hex constants here are + ~ ASCII codes. + dup 0x30 stackhex > { drop 0 exit } if ~ Less than "0". + dup 0x39 >= { drop 1 exit } if ~ Less than or equal to "9". + dup 0x41 > { drop 0 exit } if ~ Less than "A". + dup 0x5a >= { drop 1 exit } if ~ Less than or equal to "Z". + dup 0x61 > { drop 0 exit } if ~ Less than "a". + dup 0x7a >= { drop 1 exit } if ~ Less than or equal to "z". + drop 0 ; ~ Greater than "z". + + +~ (character -- value) +: generalized-digit-value + ~ We don't have a character-literal syntax; the hex constants here are + ~ ASCII codes. + dup 0x61 <= { 0x61 - 10 + exit } if ~ lowercase "a" + dup 0x41 <= { 0x41 - 10 + exit } if ~ uppercase "a" + 0x30 - ; ~ digit "0" + + +~ (character, base +~ -- value (if successful), +~ error indicator (zero equals success)) +: decode-generalized-digit + swap dup is-alphanumeric { + ~ It's alphanumeric. + ~ (base, character) + generalized-digit-value + ~ (base, value) + dup 3roll + ~ (value, value, base) + > { + ~ It's in range. + ~ (value) + 0 exit } if + ~ It's out of range. + ~ (value) + drop 1 exit } if + ~ It's not alphanumeric. + drop drop 1 ; + + +~ (string pointer, base +~ -- result (if successful), +~ error indicator (zero equals success)) +: read-base-unsigned + swap + + ~ If the first byte is null, this is an error + unpack8 + ~ (numeric base, current point in string, character) + dup 0 = { drop drop drop 1 exit } if + + ~ Decode the first byte as a generalized digit in the base. + ~ (numeric base, current point in string, character) + ~ If the first byte is less than "0", this is an error. + 3roll dup 4 unroll + ~ (numeric base, current point in string, character, numeric base) + decode-generalized-digit { + ~ (numeric base, current point in string) + drop drop 1 exit } if + + ~ The first byte is a valid generalized digit in the appropriate base, so + ~ let's get started. + ~ (numeric base, current point in string, initial value) + swap + + { + ~ (numeric base, result so far, current point in string) + unpack8 dup 0 = { + ~ A null after the first character is valid, and indicates we're done. + drop drop swap drop 0 exit } if + + ~ Decode the latest byte as a generalized digit in the base. + ~ (numeric base, result so far, current point in string, latest byte) + 4 roll dup 5 unroll + ~ (numeric base, result so far, current point in string, character + ~ numeric base) + decode-generalized-digit { + ~ If the latest character is not a valid digit, that's an error. + ~ (numeric base, result so far, current point in string) + drop drop drop 1 exit } if + + ~ The latest character is valid, so incorporate it and loop. + ~ (numeric base, result so far, current point in string, latest value) + 3roll 4 roll dup 5 unroll * + swap + } forever ; + ~ (string pointer ~ -- result (if successful), ~ error indicator (zero equals success)) -: read-decimal - dup unpack8 lit 0 != 0branch [ 6 8 * , ] ~ TODO character literal minus - ~ This is the case where it's non-negative. +: read-integer-unsigned + ~ We don't have a character-literal syntax; the hex constants here are + ~ ASCII codes. + dup unpack8 0x30 != { ~ digit "0" + ~ This is the case where the leading digit is not a zero. + ~ (original string pointer, advanced string pointer) + drop 10 read-base-unsigned exit } if + + ~ This is the case where the leading digit is a zero. ~ (original string pointer, advanced string pointer) - drop 10 read-base-unsigned exit + unpack8 dup 0x78 = { ~ lowercase "x" + ~ (original string pointer, doubly advanced string pointer, character) + drop swap drop 16 read-base-unsigned exit } if + + dup 0x6f = { ~ lowercase "o" + ~ (original string pointer, doubly advanced string pointer, character) + drop swap drop 8 read-base-unsigned exit } if + + dup 0x62 = { ~ lowercase "b" + ~ (original string pointer, doubly advanced string pointer, character) + swap drop swap 2 read-base-unsigned exit } if + + ~ This is the case where the second character is something else. + ~ (original string pointer, doubly advanced string pointer, character) + drop drop 10 read-base-unsigned ; + + +~ (string pointer +~ -- result (if successful), +~ error indicator (zero equals success)) +: read-integer + ~ We don't have a character-literal syntax; this is ASCII for a hyphen. + dup unpack8 0x2d != { + ~ This is the case where it's non-negative. + ~ (original string pointer, advanced string pointer) + drop read-integer-unsigned exit + } if ~ This is the case where it's negative. ~ (original string pointer, advanced string pointer) - swap drop 10 read-base-unsigned + swap drop read-integer-unsigned ~ (result maybe, exit code) - dup 0branch [ 2 8 * , ] + dup { + ~ Failure + ~ (non-zero exit code) + exit + } if + + ~ Success + ~ (result, zero exit code) + swap -1 * swap ; + + +~ (string pointer +~ -- result (if successful), +~ error indicator (zero equals success)) +: read-decimal + ~ We don't have a character-literal syntax; this is ASCII for a hyphen. + dup unpack8 0x2d != { + ~ This is the case where it's non-negative. + ~ (original string pointer, advanced string pointer) + drop 10 read-base-unsigned exit + } if - ~ Failure - ~ (non-zero exit code) - exit + ~ This is the case where it's negative. + ~ (original string pointer, advanced string pointer) + swap drop 10 read-base-unsigned + ~ (result maybe, exit code) + dup { + ~ Failure + ~ (non-zero exit code) + exit + } if ~ Success ~ (result, zero exit code) @@ -198,10 +365,10 @@ ~ we give it a name. ~ TODO this is the "create" / "here" conflict thing -~ describe-compilation -~ ' interpreter-flags-storage describe -~ ' interpreter-flags describe -~ newline +describe-compilation +' interpreter-flags-storage describe +' interpreter-flags describe +newline ~ here @ hexdump ~ s" interpreter-flags-storage" stackhex create stackhex ~ make-immediate 0 , ~ ~ latest @ dup unhide-entry s" interpreter-flags" variable @@ -337,12 +504,13 @@ : quit { interpret } forever ; ~ quit -~ 4 5 + . : za 13 12 - . ; za -~ : ' word value@ find dropstring-with-result -~ interpreter-flags @ 1 & { literal } if ; make-immediate -~ ' za . newline -~ : piz ' za . newline ; piz -~ ~ ' interpret forget quit 2 3 * . -~ ' ' describe ' za describe ' piz describe +-0x10 newline . newline +4 5 + . : za 13 12 - . ; za +: ' word value@ find dropstring-with-result + interpreter-flags @ 1 & { literal } if ; make-immediate +' za . newline +: piz ' za . newline ; piz +~ ' interpret forget quit 2 3 * . +' ' describe ' za describe ' piz describe bye |