path: root/interpret.e
Diffstat (limited to 'interpret.e')
-rw-r--r-- interpret.e 297
1 files changed, 20 insertions, 277 deletions
diff --git a/interpret.e b/interpret.e
index 7db86b7..d2bf10f 100644
--- a/interpret.e
+++ b/interpret.e
@@ -38,291 +38,37 @@
 ~   It is primarily concerned with managing the contents of an area of memory
 ~ we call the "log"; see log-load.e for more detail on terminology.
 
-~ TODO find a better place for this
-: describe-compilation
-  ~ It's always in progress ;) We just need a header like this so it doesn't
-  ~ get confused with other kinds of debug output.
-  ." compilation in progress" newline
-  latest @ hexdump
-  newline
-  ."   here " here @ .hex64 newline
-  ."   latest " latest @ .hex64 newline
-  ."   name of latest: " latest @ entry-to-name emitstring newline
-  newline ;
-
-
-~   Allocate space by incrementing "here", and output a word entry header in
-~ it. Also add it to the "latest" linked list. Use zero as the flag values;
-~ accept a string pointer on the stack and use its contents as the name.
-~
-~   This is the first step of creating a new word. Its responsibility includes
-~ everything up to the codeword, not including the codeword; it leaves things
-~ all set up to start appending contents to the new word by calling ",".
-~
-~   There's a handy diagram of the entry header format under "quick
-~ reference", in the description of the exeuction model in exeuction.e. Create
-~ is responsible for everything up to the codeword, not including it.
-~
-~   When a word is created in interpret mode using s" to provide a string
-~ literal, the temporary space that s" uses is in the same place as the
-~ entry header we're going to write out. It really is very useful to have
-~ that work. Fortunately, it does! We're able to avoid needing a special case
-~ by doing things in a very careful way, as described below.
-~
-~ (string pointer --)
-: create
-  ~   We add one to the string length in order to include the trailing null
-  ~ terminator. This will be the length of our name field; we save an extra
-  ~ copy of it to help with packing later.
-  dup stringlen 1 + dup 3unroll
-  ~ (name field length, string pointer, name field length)
-
-  ~   We use memmove to put the string in its final position, because it works
-  ~ correctly when the destination overlaps with the source. Notice that we
-  ~ do this before writing anything else in the entry header, to avoid
-  ~ stepping on it. The name string always starts ten bytes into the header,
-  ~ so we can use a fixed offset.
-  here @ 10 + 3unroll memmove
-  ~ (name field length)
-
-  ~   Now we can get back to the fields that belong at the start of the entry
-  ~ header. We take the value of "here" and keep a working copy of it on the
-  ~ stack, which we'll advance every time we write more bytes.
-  here @
-  ~ (name field length, updated "here" pointer)
-
-  ~   Pack the old value of "latest" as the first field of the header, linking
-  ~ from the newly-defined word to the next-newest word.
-  ~
-  ~   All the entries form a linked list, from newest to oldest. Since the
-  ~ link is the first field in the entry header, you can get from each entry
-  ~ to the one before it just by dereferencing the entry pointer.
-  latest @ pack64
-
-  ~   This is the flags byte. It starts at zero; our caller can change it if
-  ~ desired.
-  0 pack8
-
-  ~   This is the "other" null terminator, used when traversing the name
-  ~ string backwards for execution-token-to-entry. Yes, the name is
-  ~ null-terminated at both ends.
-  0 pack8
-
-  + ~ The name field is already populated, so just skip past it.
-  ~ (updated "here" pointer)
-
-  ~   The codeword is aligned to a machine-word boundary, and the padding for
-  ~ it is create's responsibility.
-  ~
-  ~   By adding the null terminator before adding alignment padding, we've
-  ~ made sure there's always at least one null byte. Otherwise we'd be missing
-  ~ the terminator if by chance the name were exactly the wrong length.
-  8 packalign
-  ~ (updated "here" pointer)
-
-  ~   Retrieve the value of "here", which still doesn't reflect our additions,
-  ~ and store it at the adddress of "latest". It's the start of our
-  ~ newly-defined word, which makes it the latest word.
-  here @ latest !
-
-  ~   Finally, we write our updated value of "here" back into the variable.
-  here ! ;
-
-~ (value to append to current word-in-progress --)
-: , here @ swap pack64 here ! ;
-
-
-: self-codeword here @ 8 + , ;
-
-
-~   A variable is simply a word that returns a specific address, always the
-~ same one, at which a value can be stored. This word "variable" takes and
-~ address and a word name, and defines the word. Allocating space is its
-~ caller's responsibility.
-~
-~ TODO the address is constant but the contents vary, confusing, write it up
-~
-~ (address for new variable word to point to, string pointer --)
-: variable
-  create
-  self-codeword
-  here @
-  swap :rax mov-reg64-imm64
-  :rax push-reg64
-  pack-next
-  8 packalign
-  here ! ;
-
-
-~   A keyword is a word that evaluates to its own address, which makes it
-~ suitable for use as a constant. By convention, all our keywords have names
-~ starting with a colon, which imitates the way they work in Common Lisp.
-~
-~   Specifically, it returns its own execution token. Thus, executing its
-~ result repeatedly will keep giving the same value. We aren't in the habit of
-~ doing quote-exec kinds of things in Evocation, but it seems as good as any
-~ other unique value, so we might as well.
-~
-~   Unlike CL, we don't currently have the lexer automatically create keywords
-~ for us; we create them explicitly. That's likely to be added at some point,
-~ but at the moment the feature is lying fallow to see whether it winds up
-~ seeing a lot of use.
-~
-~ (string pointer --)
-: keyword
-  create
-
-  ~   Before outputting our codeword, save a copy of the address where it's
-  ~ going to be. That will be the execution token we return.
-  here @ dup
-  ~ (self execution token, output point)
-
-  ~   Now add a codeword. This is an assembly word, so it's a self-codeword,
-  ~ meaning it points to the word right after itself.
-  dup 8 + pack64
-  ~ (self execution token, output point)
-
-  ~ Now we consume the execution token, using it as part of this instruction.
-  :rax mov-reg64-imm64
-  ~ (output point)
-
-  ~ To return it, we push it to the stack.
-  :rax push-reg64
-
-  ~ Now just the normal stuff every assembly word ends with.
-  pack-next
-  8 packalign
-
-  here ! ;
-
-
-~   Allocates bytes on the heap by incrementing the global "here" pointer. The
-~ "here" pointer is kept aligned to an 8-byte boundary, regardless of the size
-~ requested.
-~
-~   This does not create dictionary entries, it's just a raw memory interface.
-~ It's suitable for allocating data or scratch space.
-: allocate
-  here @ dup
-  ~ (size, here value, here value)
-  3roll + 8 packalign here ! ;
-
 : hide-entry dup entry-flags@ 0x80 | entry-flags! ;
 
 : unhide-entry dup entry-flags@ 0x80 invert & entry-flags! ;
 
 
-~ (pointer to buffer metadata -- pointer to buffer "physical-start" field)
-: buffer-physical-start ;
-  ~ The physical-start field happens to be the first thing in the metadata, so
-  ~ this is an nop, but it still exists as a word because having it reduces
-  ~ confusion.
-~ (pointer to buffer metadata -- pointer to buffer "physical-length" field)
-: buffer-physical-length 8 + ;
-~ (pointer to buffer metadata -- pointer to buffer "logical-start" field)
-: buffer-logical-start 2 8 * + ;
-~ (pointer to buffer metadata -- pointer to buffer "logical-length" field)
-: buffer-logical-length 3 8 * + ;
-~ (pointer to input buffer metadata -- pointer to input buffer "refill" field)
-: input-buffer-refill 4 8 * + ;
-~ (pointer to input buffer metadata
-~  -- pointer to input buffer "next-source" field)
-: input-buffer-next-source 5 8 * + ;
-
-~   Given an initialized buffer (input or otherwise), sets its logical-start
-~ and logical-length fields to indicate the buffer is empty. This relies on
-~ the buffer having a backing store attached, but does not alter the backing
-~ store or its contents.
-~
-~ (pointer to buffer metadata --)
-: clear-buffer
-  dup buffer-physical-start @ swap
-  ~ (address of backing store, metadata pointer)
-  dup 3unroll
-  ~ (metadata pointer, address of backing store, metadata pointer)
-  buffer-logical-start !
-  buffer-logical-length 0 swap ! ;
-
-
-~   Sets all fields in an input buffer metadata structure to zero,
-~ effectively detaching and leaking any backing store that had been attached
-~ to it. Suitable for use during initialization.
-~
-~ (pointer to input buffer metadata --)
-: zero-input-buffer-metadata
-  dup buffer-physical-start 0 swap !
-  dup buffer-physical-length 0 swap !
-  dup buffer-logical-start 0 swap !
-  dup buffer-logical-length 0 swap !
-  dup input-buffer-refill 0 swap !
-  ~ Notice the absence of a dup this time.
-  input-buffer-next-source 0 swap ! ;
-
-
-~   Allocates input-buffer metadata, with no backing store attached.
-~ Initializes the metadata to all zeroes.
-~
-~ (-- pointer to input buffer metadata)
-: allocate-input-buffer-metadata
-  6 8 * allocate
-  dup zero-input-buffer-metadata ;
-
-
-~   Allocates input buffer metadata and a backing store, in one operation.
-~ Points the metadata to the backing store.
-~
-~ (buffer capacity in bytes -- pointer to input buffer metadata)
-: allocate-input-buffer
-  dup 6 8 * + allocate
-  dup zero-input-buffer-metadata
-  ~ (capacity in bytes, metadata pointer)
-  dup dup 6 8 * +
-  ~ (capacity in bytes, metadata pointer, metadata pointer, physical start)
-  swap buffer-physical-start !
-  ~ (capacity in bytes, metadata pointer)
-  dup 3unroll buffer-physical-length !
-  ~ (metadata pointer)
-  dup clear-buffer ;
-
-
-~   Sets the backing store of an input buffer to point at a null-teriminated
-~ string and read from it.
-~
-~ (buffer metadata pointer, string pointer --)
-: attach-string-to-input-buffer
-  swap
-  ~ (string pointer, metadata pointer)
-  2dup buffer-physical-start !
-  ~ (string pointer, metadata pointer)
-  2dup buffer-logical-start !
-  ~ (string pointer, metadata pointer)
-  swap stringlen swap
-  ~ (string length, metadata pointer)
-  2dup buffer-physical-length !
-  ~ (string length, metadata pointer)
-  buffer-logical-length ! ;
-
-
 ~ TODO
-~ main-input-buffer-metadata                            0000001000018738 I raw
-~ main-input-buffer                                     0000001000018788 asm
-~ consume-from                                          00000010000187c0
-~ peek-from                                             0000001000018960
-~ key-from                                              0000001000018ab8
-~ is-space                                              0000001000018b00
-~ peek                                                  0000001000018d20
-~ consume                                               0000001000018d50
-~ key                                                   0000001000018d88
 ~ unroll-past-string                                    0000001000018db8
 ~ swap-past-string                                      0000001000018ea0
 ~ dropstring                                            0000001000018ee8
 ~ dropstring-with-result                                0000001000018f80
 ~ accumulate-string                                     0000001000018fc8
+~ is-space                                              0000001000018b00
 ~ word                                                  00000010000194a0
 
-
-~ (string pointer -- entry pointer or 0)
-: find latest swap find-in ;
+~   The word "'", often pronounced "tick", quotes the following word, looking
+~ it up and treating it as a constant. In immediate mode, the constant winds
+~ up on the stack; in compile mode it gets compiled.
+~
+~   There are a few possible implementation strategies here. Running as an
+~ immediate word means there's a clear and unambiguous concept of "the
+~ following word", so that's what we do; otherwise we'd have to get clever
+~ about somehow finding out where we were called from. That means we take on
+~ what would otherwise be the interpreter's responsibility, of checking what
+~ mode we're in. Happily, that's easy to do.
+~
+~   There's a cyclic dependency where "if" relies on "'", and "'" relies on
+~ "if". Fortunately both of them are treated as alternates by the log-load
+~ transform, so we don't have to worry about it.
+: ' word value@ find dropstring-with-result
+  interpreter-flags @ 1 & { literal } if
+  ; make-immediate
 
 
 ~ (character -- 1 for true or 0 for false)
@@ -506,11 +252,8 @@
 ~ harder for words that attempt to work with the contents of other words. So
 ~ we give it a name.
 
-s" interpreter-flags-storage" create make-immediate
-latest @ unhide-entry
-here @
-0 ,
-s" interpreter-flags" variable
+s" interpreter-flags-storage" create make-immediate make-visible
+here @ 0 , s" interpreter-flags" variable
 
 ~   There's an important bootstrapping concern: If you're loading this
 ~ interpreter into a running Evocation, it's important to not use the wrong