| author | Irene Knapp <ireneista@irenes.space> | 2026-05-16 00:43:24 -0700 |
|---|---|---|
| committer | Irene Knapp <ireneista@irenes.space> | 2026-05-16 00:43:24 -0700 |
| commit | 9922f43d213bb2f710f90cb5165a7580ef0326cd (patch) | |
| tree | 8efe744025fb1673fa506293f385e96f12d7fb8d /transform.e | |
| parent | 09ae826c2f99b7e21606e01f577b934c1d669bdc (diff) | |
it's much closer, most of the labels work correctly now
Force-Push: yes
Change-Id: I45662e60c0035758a2cd57d971031eb0562eccb7
Diffstat (limited to 'transform.e')
| -rw-r--r-- | transform.e | 118 |
1 files changed, 73 insertions, 45 deletions
diff --git a/transform.e b/transform.e
index d727d53..e913d42 100644
--- a/transform.e
+++ b/transform.e
@@ -18,9 +18,12 @@
 ~ literals, are looked up via the label facility.
 ~
 ~ Since the label facility is able to resolve forward references, there is
-~ no hard requirement that words be topologically sorted, but forward
-~ references should still be kept to a minimum, since that's a significant
-~ difference from un-transformed code that could easily become confusing.
+~ no hard requirement that everything in the file be topologically sorted.
+~ However, the transform will refuse to create forward references to compiled
+~ words. If you want them, you can create them by hand by calling use-label
+~ yourself. This restriction is in place because allowing forward references
+~ would be a significant difference from un-transformed code that could easily
+~ become confusing, and because it simplifies the implementation a bit.
 ~
 ~ Compilation words do make extensive reference to the global variables
 ~ "here" and "latest". In particular, flow-control words such as if-else
@@ -40,15 +43,7 @@
 ~ for not doing anything else that would rely on "here" and "latest" matching
 ~ their runtime addresses, though it is otherwise allowed to modify and rely
 ~ on them in all the usual ways. The alternate versions are defined in this
-~ file as their own words, "Lcreate", "L:", "L;", and "L;asm". The alternates
-~ rely on various labels, all of which must be defined elsewhere:
-~
-~ * origin
-~ * docol
-~ * exit
-~ * :
-~ * ;
-~ * ;asm
+~ file as their own words, "Lcreate", "L:", "L;", and "L;asm".
 ~
 ~ Note that these alternates are applied via a purely lexical
 ~ transformation: when a word would be looked up in the dictionary to
@@ -60,6 +55,10 @@
 ~ the rest of Evocation. There's no need to keep it separate like there is
 ~ with the other variables. This makes it easy to change modes.
 ~
+~ The transformation and the alternates rely on various labels, all of which
+~ must be defined elsewhere, lest the label loop fail to converge: "lit",
+~ "origin", "docol", "exit", ":", ";", and ";asm".
+~
 ~ All of these limitations result in the compiled code being, in effect,
 ~ written in a dialect which is like Evocation, but more restricted. This is
 ~ acceptable, because the label transform is intended for compiling code that
@@ -180,6 +179,7 @@ allocate-transform-state s" transform-state" variable
   dup stringlen 1 + dup 3unroll
   here @ 10 + 3unroll memmove
   here @
+
   ~ This value of "latest" is going into the generated output, so call
   ~ transform-offset on it first.
   latest @ transform-offset pack64
@@ -188,6 +188,14 @@ allocate-transform-state s" transform-state" variable
   + 8 packalign
   here @ latest !
+
+  ~ Now we're immediately after the word header, which is where the codeword
+  ~ will be. This is the value the label should taken on, so we set it.
+  dup here @ 10 +
+  swap-transform-variables
+  intern-label set-label
+  swap-transform-variables
+
   here ! ;
@@ -282,23 +290,6 @@ allocate-transform-state s" transform-state" variable
     3roll drop swap } unless
   ~ (name as stack string, immediate entry pointer, name pointer)
-  ~ For compile mode, we need to look the word up in the output buffer. We
-  ~ can't easily traverse the next-entry pointers in the output buffer's
-  ~ dictionary, so we use the label.
-  ~
-  ~ Labels point to codewords (because that's what "Lcreate" does), so we
-  ~ have to convert it to get the entry pointer. Since we don't know the
-  ~ word's name statically, this is a rare scenario where we can't use the
-  ~ abbreviated label syntax, but that's easy enough.
-  ~
-  ~ We do have to be careful of one thing: On the first run, the label may
-  ~ be zero!
-  swap-transform-variables
-  intern-label use-label
-  swap-transform-variables
-  dup { execution-token-to-entry } if
-  ~ (name as stack string, immediate entry pointer, compiled entry pointer)
-
   ~ In regular "interpret", we would check whether we found the word before
   ~ checking the mode. However, we have three different places words could
   ~ come from, so that's not a simple notion. So, we check the mode first.
@@ -309,30 +300,67 @@ allocate-transform-state s" transform-state" variable
     ~ override an immediate word with a non-immediate word of the same name.
     over dup { entry-flags@ 0x01 & not } if
-    ~ Either there was no immediate entry, or the immediate entry wasn't
-    ~ flagged as an immediate word. So we treat this as a compilation, which
-    ~ means we append a word to the heap. Specificaly, of course, we use the
-    ~ compiled entry to do that.
-    { swap drop dropstring-with-result
-      entry-to-execution-token ,
-      0 exit } if
+    {
+      ~ Either there was no immediate entry, or the immediate entry wasn't
+      ~ flagged as an immediate word. So we check whether this could be a
+      ~ compilation.
+      ~
+      ~ To do this, we need to look the word up in the output buffer. We
+      ~ can't easily traverse the next-entry pointers in the output buffer's
+      ~ dictionary, so we check the label. Since we don't know the word's name
+      ~ statically, this is a rare scenario where we can't use the abbreviated
+      ~ label syntax, but that's easy enough.
+      ~
+      ~ Even though we've ruled out the possibility that the word is only
+      ~ ever used immediately, it is still possible that there's some reason
+      ~ the word doesn't exist. In particular, it could be an integer literal.
+      ~ If we were to call use-label first, that would count as a requirement
+      ~ that the label must eventually be set. We don't want to require that
+      ~ quite yet, so we call find-label.
+      ~
+      ~ This check is the means by which forward references are disallowed:
+      ~ On the very first pass, a forward-referenced label won't exist yet, so
+      ~ transform will give a "no such word" error, which in an ideal world
+      ~ would prevent there from being a subsequent pass, but at the very
+      ~ least it will ensure the output isn't a valid ELF.
+      dup
+      swap-transform-variables
+      find-label
+      swap-transform-variables
+      {
+        ~ It exists, so we declare our use of it (that's also the only way to
+        ~ get a value for it).
+        swap-transform-variables
+        intern-label use-label
+        swap-transform-variables
+
+        ~ Labels point to codewords (because that's what "Lcreate" does),
+        ~ which is already what we want to output.
+        ~
+        ~ An important caveat: Though it would require something weird to be
+        ~ happening, such as a forced forward reference, the label may be zero!
+        ~ We need to allow for that possibility by not examining the contents of
+        ~ a nonexistent entry.
+        ~
+        ~ Fortunately we don't have to look at it, just append it to the heap
+        ~ and clean up.
+        drop , dropstring 0 exit
+      } if
+    } if
   } if
+  ~ (name as stack string, immediate entry pointer, name pointer)
   ~ If we got here, one of three things is true: We're in interpret mode;
-  ~ the word is immediate; or no word was found. Regardless, we don't need
-  ~ the compiled entry pointer anymore, so drop it.
-  drop
-  ~ (name as stack string, immediate entry pointer)
-
-  ~ If the immediate entry pointer is non-zero, run it.
-  dup {
-    dropstring-with-result entry-to-execution-token execute
+  ~ the word is immediate; or no word was found. If the immediate entry
+  ~ pointer is non-zero, run it.
+  over {
+    drop dropstring-with-result entry-to-execution-token execute
     0 exit } if
   ~ If we're still here, it wasn't in the dictionary. Also, we don't need
   ~ the immediate entry pointer, either.
-  drop
+  drop drop
   ~ (name as stack string)
   ~ If it's not in the dictionary, check whether it's an integer literal. As