~ Execution support
~ ~~~~~~~~~~~~~~~~~
~
~   These macros are an important part of the execution model described in
~ execution.e. They're here, in this file, because they need to be statically
~ available via the label transform, so that the log-load transform can rely
~ on them.


~ Macro next
~ ~~~~~~~~~~
~
~   Include this inline at the end of a word implemented in machine-code.
~ Conceptually, it returns. What it actually does is the next thing the
~ caller would do: transfer control to the next word in the caller's array
~ of word pointers.
~
~   This is a widespread technique in Forth implementation, referred to as
~ indirect threaded code. It's "threaded" in the sense that each word takes
~ responsibility for finishing up by following the notional thread through the
~ metaphorical labyrinth to figure out the next word that its caller wants to
~ run after it. In other words, control never directly returns to the parent;
~ it proceeds directly to the sibling.
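~
~   As a rough sketch in C-like pseudocode, treating the registers as if they
~ were 64-bit variables (an illustrative assumption, not how the macro is
~ written), the whole of "next" amounts to:
~
~     rax = *rsi;        /* fetch the next word's address from the array */
~     rsi = rsi + 8;     /* step to the following entry (DF = 0) */
~     goto **rax;        /* jump to the interpreter the codeword names */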
~
~ Registers in:
~
~ * rsi points to the address of the word to execute
~
~ Registers out:
~
~ * rax points to the codeword in the contents of the word that was executed
~ * rsi points to the next word-address after this one
~
~ Flags:
~
~ * DF = 0 is required
~
~ (base address -- new base address)
: pack-next
  ~ Copy the next word's address from *rsi into rax. Increment rsi (as per the
  ~ DF flag).
  lods64

  ~ Load the codeword from the word's contents, and jump to the interpreter it
  ~ points to.
  :rax jmp-abs-indirect-reg64 ;


~ Macro beforenext
~ ~~~~~~~~~~~~~~~~
~
~   Sometimes we want to transfer control from a word implemented in
~ machine-code to another word, without coming back after, as if we were
~ simply jumping to it. This is an innovation of ours; Jonesforth doesn't do
~ it. It is similar to the tail-call optimization that many Lisp dialects
~ have.
~
~   This implementation will work regardless of how the receiving word is
~ implemented. It impersonates the "next" snippet, setting up rax to point
~ to the codeword then jumping to the interpreter. Since it doesn't change
~ the control stack or rsi, when the receiving word eventually invokes
~ "next", it will pick up in the same place as if this sending word had done
~ it.
~
~   Thus, notionally we are doing just this one transfer of control before
~ eventually getting around to inlining "next". Hence the name.
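~
~   As a rough sketch in C-like pseudocode, treating the registers as if they
~ were 64-bit variables and "target" as the target address parameter (both
~ are illustrative assumptions, not how the macro is written):
~
~     rax = target;      /* what "next" would have fetched from *rsi */
~     goto **rax;        /* jump through the codeword; rsi is untouched */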
~
~ (target address, base address -- new base address)
: pack-beforenext
  ~ Do a permanent transfer of control by setting rax and invoking the
  ~ codeword. Of course, we could jump to docol ourselves, but this will work
  ~ regardless of what the receiving codeword is.
  :rax mov-reg64-imm64
  :rax jmp-abs-indirect-reg64 ;


~ Macros pushcontrol
~        popcontrol
~ ~~~~~~~~~~~~~~~~~~
~
~   Include these inline to push an address onto the control stack, or pop
~ one off of it. You will recall the control stack pointer is kept in rbp.
~ The register to push from, or pop into, is chosen by the caller.
~
~   Jonesforth's analogous macros are called PUSHRSP and POPRSP, but I think
~ that's super confusing, since rsp is also the name of a register, and a
~ different one at that. I guess it was less confusing in 32-bit, since esp
~ doesn't start with an "r". Anyway, this has to be named something that
~ distinguishes it from Intel's PUSH and POP instructions, so...
~
~   "Load effective address" is just a cute way to do arithmetic on a
~ register, here. To push we decrement rbp by 8; to pop we increment it by
~ 8. To actually interact with the space in the stack, we indirect through
~ rbp.
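~
~   As a rough sketch in C-like pseudocode, treating the registers as if they
~ were 64-bit variables and "reg" as the caller's chosen register (both are
~ illustrative assumptions, not how the macros are written):
~
~     /* pushcontrol */
~     rbp = rbp - 8;     /* lea: make room for one 8-byte cell */
~     *rbp = reg;        /* mov: store reg into the new top cell */
~
~     /* popcontrol */
~     reg = *rbp;        /* mov: load the top cell into reg */
~     rbp = rbp + 8;     /* lea: release the cell */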
~
~ Registers in and out:
~
~ * rbp points to the top of the control stack.
~
~ (source register, base address -- new base address)
: pack-pushcontrol
  swap :rbp -8 :rbp lea-reg64-disp8-reg64
  swap :rbp 0 mov-disp8-reg64-reg64 ;

~ (target register, base address -- new base address)
: pack-popcontrol
  :rbp 0 3roll mov-reg64-disp8-reg64
  :rbp 8 :rbp lea-reg64-disp8-reg64 ;