implement "create" in interpret.e

memmove never had a heap-based implementation, probably due to excessive hurrying when it was first written. it has now been added. it's identical to the original static implementation. a minor documentation error in evoke.e was discovered while coding this, and is now fixed. Force-Push: yes Change-Id: I55ff09712a992410167a53cd83e96a452ed49744
author: Irene Knapp <ireneista@irenes.space> 2026-05-09 20:29:14 -0700
committer: Irene Knapp <ireneista@irenes.space> 2026-05-09 20:29:14 -0700
commit: 51e63deb372598185562858dbcb744114e2b1359 (patch)
tree: b01758a702a4d2dfce617dba42c1e7ec375b8866
parent: 6aded817e2ed13143db15040d88e86d0649f4e85 (diff)
2 files changed, 96 insertions, 15 deletions
diff --git a/interpret.e b/interpret.e
index 2bb9e43..235c98b 100644
--- a/interpret.e
+++ b/interpret.e
@@ -50,23 +50,83 @@
   ."   name of latest: " latest @ entry-to-name emitstring newline
   newline ;
 
-~ TODO this is identical to the flatassembler version, but it needs to fix the
-~ conflict with s"
+~   Allocate space by incrementing "here", and output a word entry header in
+~ it. Also add it to the "latest" linked list. Use zero as the flag values;
+~ accept a string pointer on the stack and use its contents as the name.
+~
+~   This is the first step of creating a new word. Its responsibility includes
+~ everything up to the codeword, not including the codeword; it leaves things
+~ all set up to start appending contents to the new word by calling ",".
+~
+~   There's a handy diagram of the entry header format under "quick
+~ reference", in the description of the exeuction model in evoke.e. Create is
+~ responsible for everything up to the codeword, not including it.
+~
+~   When a word is created in interpret mode using s" to provide a string
+~ literal, the temporary space that s" uses is in the same place as the
+~ entry header we're going to write out. It really is very useful to have
+~ that work. Fortunately, it does! We're able to avoid needing a special case
+~ by doing things in a very careful way, as described below.
+~
+~ (string pointer --)
 : create
+  ~   We add one to the string length in order to include the trailing null
+  ~ terminator. This will be the length of our name field; we save an extra
+  ~ copy of it to help with packing later.
+  dup stringlen 1 + dup 3unroll
+  ~ (name field length, string pointer, name field length)
+
+  ~   We use memmove to put the string in its final position, because it works
+  ~ correctly when the destination overlaps with the source. Notice that we
+  ~ do this before writing anything else in the entry header, to avoid
+  ~ stepping on it. The name string always starts ten bytes into the header,
+  ~ so we can use a fixed offset.
+  here @ 10 + 3unroll memmove
+  ~ (name field length)
+
+  ~   Now we can get back to the fields that belong at the start of the entry
+  ~ header. We take the value of "here" and keep a working copy of it on the
+  ~ stack, which we'll advance every time we write more bytes.
   here @
+  ~ (name field length, updated "here" pointer)
+
+  ~   Pack the old value of "latest" as the first field of the header, linking
+  ~ from the newly-defined word to the next-newest word.
+  ~
+  ~   All the entries form a linked list, from newest to oldest. Since the
+  ~ link is the first field in the entry header, you can get from each entry
+  ~ to the one before it just by dereferencing the entry pointer.
   latest @ pack64
+
+  ~   This is the flags byte. It starts at zero; our caller can change it if
+  ~ desired.
   0 pack8
+
+  ~   This is the "other" null terminator, used when traversing the name
+  ~ string backwards for execution-token-to-entry. Yes, the name is
+  ~ null-terminated at both ends.
   0 pack8
-  swap packstring
+
+  + ~ The name field is already populated, so just skip past it.
+  ~ (updated "here" pointer)
+
+  ~   The codeword is aligned to a machine-word boundary, and the padding for
+  ~ it is create's responsibility.
+  ~
+  ~   By adding the null terminator before adding alignment padding, we've
+  ~ made sure there's always at least one null byte. Otherwise we'd be missing
+  ~ the terminator if by chance the name were exactly the wrong length.
   8 packalign
+  ~ (updated "here" pointer)
+
+  ~   Retrieve the value of "here", which still doesn't reflect our additions,
+  ~ and store it at the adddress of "latest". It's the start of our
+  ~ newly-defined word, which makes it the latest word.
   here @ latest !
-  here ! ;
 
-latest @ describe
-s" foo" create
-latest @ describe
+  ~   Finally, we write our updated value of "here" back into the variable.
+  here ! ;
 
-~ create                                                0000001000017fa8
 ~ ,                                                     0000001000018080
 ~ self-codeword                                         00000010000180d0
 ~ variable                                              0000001000018128
diff --git a/quine.asm b/quine.asm
index e86e61c..12a45f8 100644
--- a/quine.asm
+++ b/quine.asm
@@ -2459,6 +2459,27 @@ cold_start:
   dq rdx, rsi, mov_reg64_reg64
   dq pack_next, lit, 8, packalign, early_here_store
 
+  dq litstring, "memmove", early_create, early_self_codeword
+  dq early_here, fetch
+  dq rsi, rdx, mov_reg64_reg64
+  dq rcx, pop_reg64
+  dq rsi, pop_reg64
+  dq rdi, pop_reg64
+  dq rsi, rax, mov_reg64_reg64
+  dq rdi, rax, cmp_reg64_reg64
+  dq lit, 4, cc_below, jmp_cc_rel_imm8
+  dq rep_movs8
+  dq lit, 16, jmp_rel_imm8
+  dq rcx, rsi, add_reg64_reg64
+  dq rsi, dec_reg64
+  dq rcx, rdi, add_reg64_reg64
+  dq rdi, dec_reg64
+  dq std
+  dq rep_movs8
+  dq cld
+  dq rdx, rsi, mov_reg64_reg64
+  dq pack_next, lit, 8, packalign, early_here_store
+
   dq litstring, "stringlen", early_create, early_self_codeword
   dq early_here, fetch
   dq rdi, pop_reg64
@@ -11790,14 +11811,14 @@ defword boot_source, 0x40
   ; We'll have logical and real soon now, be patient... :)
   dq ": is-in-heap dup heap @ <= swap here @ > & ;                    "
 
-  dq ": unlink-pre-heap-words                                         "
-  dq "  latest @                                                      "
-  dq "  dup { dup is-in-heap 0 = { drop exit } if }                   "
-  dq "      { drop exit } if-else                                     "
-  dq "  { dup @ is-in-heap } { @ } while                              "
-  dq "  0 swap ! ;                                                    "
+;  dq ": unlink-pre-heap-words                                         "
+;  dq "  latest @                                                      "
+;  dq "  dup { dup is-in-heap 0 = { drop exit } if }                   "
+;  dq "      { drop exit } if-else                                     "
+;  dq "  { dup @ is-in-heap } { @ } while                              "
+;  dq "  0 swap ! ;                                                    "
   ; Do it immediately, so that we don't accidentally rely on any of that.
-  dq "unlink-pre-heap-words                                           "
+  ;dq "unlink-pre-heap-words                                           "
 
   ; Now some fancier stack combinators.
   ;
author	Irene Knapp <ireneista@irenes.space>	2026-05-09 20:29:14 -0700
committer	Irene Knapp <ireneista@irenes.space>	2026-05-09 20:29:14 -0700
commit	51e63deb372598185562858dbcb744114e2b1359 (patch)
tree	b01758a702a4d2dfce617dba42c1e7ec375b8866
parent	6aded817e2ed13143db15040d88e86d0649f4e85 (diff)