| author | Irene Knapp <ireneista@irenes.space> | 2026-05-17 17:29:09 -0700 |
|---|---|---|
| committer | Irene Knapp <ireneista@irenes.space> | 2026-05-17 17:29:09 -0700 |
| commit | 3b41dbfa2338c11dd8398026c00922f20f32dc81 (patch) | |
| tree | 1761f2b8f2da972b830cf617d194c7c5be5b1535 /core.e | |
| parent | 4f1a07da9c87a1560da34b8a96a9de4cdc90f1fc (diff) | |
implement log-load-create; add all the pack/unpack stuff to core.e

Force-Push: yes
Change-Id: I04dd65a9eec71f9b50c8875bdcbe5d4be59888d5
Diffstat (limited to 'core.e')
| -rw-r--r-- | core.e | 141 |
1 files changed, 141 insertions, 0 deletions
diff --git a/core.e b/core.e
index 9393a04..b2062d1 100644
--- a/core.e
+++ b/core.e
@@ -1,3 +1,15 @@
+~ ~~~~~~~~~~~~~~~~~~~~~~~~~
+~ ~~ Core Forth features ~~
+~ ~~~~~~~~~~~~~~~~~~~~~~~~~
+~
+~ This file provides extremely fundamental functionality which is a
+~ necessary component of any Forth dialect, including Evocation. It is
+~ included statically as part of any generated executable, and a second copy
+~ of it is later copied into the log when that executable runs. Therefore, it
+~ is written to obey the constraints of both the label transform, and the
+~ log-load transform; see transform.e for more details on that.
+~
+~
 ~ Stack manipulation routines
 ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~
@@ -955,3 +967,132 @@
 :rax jmp-abs-indirect-reg64 here ! ] ;asm
+
+~ Dictionary entries
+~ ~~~~~~~~~~~~~~~~~~
+~
+~ Now, we have a bunch of words that are used for traversing the Forth
+~ core data structures that describe words. First, we have a couple that
+~ relate to individual words and their pieces...
+~
+~ The log-load transform produces code that requires
+~ entry-to-execution-token, which means it's needed statically. So this stuff
+~ to deal with word entry headers might as well go in core, since it has no
+~ dependencies to speak of.
+~
+~ These are the first words in core that are implemented in Forth rather
+~ than assembler. That's not as big a deal as it may seem; the Forth execution
+~ model has been ready-to-go ever since we implemented docol and exit, and
+~ at this point we have enough basics to do useful things with it.
+
+~ Jonesforth calls this "TFCA" and ">CFA"; its author speculates that the
+~ original meaning is "code field address".
+~
+~ (entry pointer -- execution token)
+: entry-to-execution-token
+  ~ Skip next-entry pointer, flag byte, and start terminator.
+  10 +
+  ~ Skip string contents.
+  dup stringlen +
+  ~ Skip one for the null terminator, seven more for alignment.
+  8 +
+  ~ Zero the low bits and now it's aligned.
+  7 invert & ;
+
+~ Jonesforth calls this "CFA>". Jonesforth's implementation searches the
+~ entire dictionary, since its word header format isn't designed to be
+~ traversed in reverse, but ours is, so it should be fast.
+~
+~ (execution token -- entry pointer)
+: execution-token-to-entry
+  1 -
+  dup reverse-padding-len -
+  dup reverse-stringlen -
+  9 - ;
+
+~ (entry pointer -- flags byte)
+: entry-flags@
+  8 + @ 0xFF & ;
+
+~ TODO these parameters are in a counterintuitive order, swap them
+~ (entry pointer, new flags byte --)
+: entry-flags!
+  swap
+  8 +
+  dup @ 3roll
+  0xFF &
+  swap 0xFFFFFFFFFFFFFF00 & |
+  swap !
+  ;
+
+~ (entry pointer -- name string pointer)
+: entry-to-name 10 + ;
+
+
+~ Binary packing
+~ ~~~~~~~~~~~~~~
+~
+~ These routines are for building up data structures in-memory. Sometimes
+~ they're used for structures that are meant to stay in memory; other times
+~ it's a buffer that will become output.
+~
+~ The general pattern is that each routine takes an output address and
+~ some specific datum, and returns the output address adjusted to point
+~ after the new datum. That makes them easy to chain together. We call this
+~ address the "output point", to capture the idea that it's a running total
+~ which gets updated by each new datum as it's packed.
+
+~ (output point, value -- output point)
+: pack64 swap dup 3unroll ! 8 + ;
+: pack32 swap dup 3unroll 32! 4 + ;
+: pack16 swap dup 3unroll 16! 2 + ;
+: pack8 swap dup 3unroll 8! 1 + ;
+
+~ This works on C-style strings, which are characters followed by a null
+~ terminator. The packed data includes the null terminator.
+~
+~ (output point, string pointer -- output point)
+: packstring
+  dup stringlen 1 + dup
+  ~ (output point, source, length, length)
+  4 roll dup 5 unroll
+  ~ (destination, source, length, length, output point)
+  + 4 unroll
+  ~ (output point, destination, source, length)
+  memcopy ;
+
+~ (output point, alignment byte count -- output point)
+: packalign
+  { 2dup /% drop { drop exit } unless
+  swap 0 pack8 swap } forever ;
+
+
+~ Binary unpacking
+~ ~~~~~~~~~~~~~~~~
+~
+~ These routines are for examining data structures in-memory.
+~
+~ Similarly to the output routines, each routine takes an input address,
+~ which it updates to point after the data item being read. We call this the
+~ "input point". Since this is input, the routines return data items rather
+~ than accepting them.
+
+~ (input point -- input point, value)
+: unpack64 dup @ swap 8 + swap ;
+: unpack32 dup 32@ swap 4 + swap ;
+: unpack16 dup 16@ swap 2 + swap ;
+: unpack8 dup 8@ swap 1 + swap ;
+
+~ TODO does this need to have a separate name?
+~ (proposed size, alignment byte count -- adjusted size)
+: align-size
+  dup 3unroll dup 3unroll
+  ~ (alignment, alignment, proposed size, alignment)
+  1 - +
+  swap /% swap drop * ;
+
+~ You might think this would be identical to packalign, but packalign has
+~ side effects.
+~
+~ (input point, alignment byte count -- input point)
+: unpackalign align-size ;
+
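
The "output point" and "input point" pattern described in the comments above can be modelled outside of Forth. The following is a rough Python sketch, not the commit's code: the function names mirror the words in the diff, but the `bytearray` buffer, the integer offsets standing in for raw addresses, and the little-endian byte order are all assumptions made for illustration.

```python
import struct

# Each pack routine writes one datum at `point` and returns the new output point.
# Little-endian layout is an assumption; the diff does not state the byte order.
def pack64(buf, point, value):
    struct.pack_into("<Q", buf, point, value)
    return point + 8

def pack8(buf, point, value):
    struct.pack_into("<B", buf, point, value)
    return point + 1

def packstring(buf, point, text):
    # Copy the bytes plus a null terminator, as the packstring comment describes.
    data = text + b"\x00"
    buf[point:point + len(data)] = data
    return point + len(data)

def packalign(buf, point, alignment):
    # Emit zero bytes until the output point is a multiple of `alignment`.
    while point % alignment != 0:
        point = pack8(buf, point, 0)
    return point

# Each unpack routine returns (new input point, value).
def unpack64(buf, point):
    return point + 8, struct.unpack_from("<Q", buf, point)[0]

def unpack8(buf, point):
    return point + 1, struct.unpack_from("<B", buf, point)[0]

def align_size(size, alignment):
    # Round `size` up to the next multiple of `alignment`, with no side effects.
    return (size + alignment - 1) // alignment * alignment

def unpackalign(point, alignment):
    # Same arithmetic as align_size; nothing is written to the buffer.
    return align_size(point, alignment)

# Chained use: every call feeds its returned point into the next call.
buf = bytearray(32)
end = pack8(buf, 0, 0x7F)
end = packstring(buf, end, b"hi")
end = packalign(buf, end, 8)
end = pack64(buf, end, 0x1122334455667788)
```

Because each routine returns the updated point, the calls chain in the same way the Forth words do on the stack. The sketch also makes the difference between `packalign` and `unpackalign` visible: the former writes padding bytes, while the latter only computes the rounded-up offset.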