~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~ ~~ Machine label facility ~~ ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~ Compilers and assemblers always have a need to resolve symbolic names for ~ parts of their code to numeric values. Usually, these symbolic names are ~ called labels. ~ ~ It's a surprisingly deep problem. The output itself can end up depending ~ on the actual values, for a variety of reasons. For example, the size of ~ a relative-jump instruction may vary depending on how far away its target ~ is, which changes the position of everything after. Or, there may be a size ~ limit on some particular segment and code may have to be entirely ~ reorganized to make things fit, splitting it into pieces or adding ~ trampolines. ~ ~ Often, this winds up being a hidden layer with immense complexity, ~ implemented as part of a linker, that even compiler maintainers find obscure ~ and confusing. ~ ~ It's Irenes' position that we should engage with complexity rather than ~ encysting it. So, in Evocation we take a different approach, heavily ~ inspired by the semantics of a tool called flatassembler. We provide ~ downstream code with the basic operations set-label and use-label which it ~ can combine in whatever order it wants. We also provide a harness, ~ label-loop, which takes an execution token for the downstream code ~ generator and runs it over and over, computing the mathematical fixed point ~ of the label assignments. ~ ~ On the first pass, label-loop guesses a value of zero for any label that's ~ used before it's set. On subsequent passes, each label starts with the value ~ it had on the previous pass. When we have a pass where labels end with the ~ same value they started with, we say they've converged, and we announce ~ success. If a hundred passes go by without convergence, we fail instead. ~ ~ Most of the time, the shorthand words L@' and L!' will be all you need to ~ use from your own code. ~ TODO this should go somewhere else :) ~ (current output point, string pointer --) : pack-raw-string { unpack8 dup } { 3roll swap pack8 swap } while drop drop ; ~ The labels data structure is a linked list of dictionary entries, with ~ the same header format as the main Evocation dictionary, but instead of ~ being executable, each holds two words of data: status bits, and a value, ~ in that order. There's no "docol" pointer or anything of the sort. ~ ~ Status is a bit field: ~ bit zero is whether it's been used ~ bit one is whether it's been defined ~ bit two is whether it was used before being defined ~ bit three is whether the guessed value wound up equaling the actual value ~ ~ Just as the Evocation dictionary uses the global variable "latest" as a ~ handle (a pointer to a pointer) beginning a linked list of entries, so the ~ label dictionary also uses a handle variable, named "labels". ~ TODO we should just do this in immediate mode, but right now the word ~ "variable" steps on the same scratch space that s" uses, so we can't. : init-labels 8 allocate s" labels" variable 0 s" labels" find entry-to-execution-token execute ! ; ~ This needs to happen now because otherwise the word "labels" won't exist. init-labels ~ This is analogous to word-heading, but prints label information. ~ (entry pointer --) : label-heading dup entry-to-name dup emitstring space stringlen 1+ 54 swap - 0 max indent entry-to-execution-token dup 8 + @ .hex64 space @ .hex8 newline ; ~ TODO this should go elsewhere ~ (dictionary handle) : oldest-entry-in dup dup { { dup @ } { @ } while } if dup 3roll = { drop 0 } if ; ~ TODO this should go elsewhere ~ (entry pointer, dictionary handle) : next-newer-entry-in @ 2dup = { 2drop 0 exit } if { dup { 2dup @ != } if } { @ } while swap drop ; ~ This prints the headings for all the labels that have been created. Note ~ that labels that have been created once stay in the dictionary forever, even ~ if subsequent passes neither use nor define them. That's because the control ~ loop has no way to know if they're still important; it's up to downstream ~ code to decide that. Everything has been carefully set up so that disused ~ labels don't hurt anything. : list-labels labels oldest-entry-in { dup } { dup label-heading labels next-newer-entry-in } while drop ; ~ This creates a new label given a name for it, initializing its value and ~ status to zero and adding it to the dictionary. This is responsible for the ~ initial guess of zero on the first pass. ~ ~ (name string pointer -- ) : new-label labels create-in 0 , 0 , ; ~ These helpers take a label entry pointer and return pointers to the status ~ and value fields. : label-status entry-to-execution-token ; : label-value entry-to-execution-token 8 + ; ~ This looks up a label by name if it exists, or creates it if it doesn't. ~ Either way, it returns an entry pointer. It's named after the function ~ "intern" that many Lisp dialects have. ~ ~ (name string pointer -- entry pointer) : intern-label dup labels swap find-in dup { swap drop } { drop dup new-label labels swap find-in } if-else ; ~ This returns the value of a label, also doing all necessary status checks ~ and updates to keep track of the circumstances under which it was used. The ~ label loop relies on all read accesses going through this word. ~ ~ (label entry pointer -- label value) : use-label ~ If it hasn't been defined yet, mark it used-before-set. dup label-status dup @ dup 0x02 & not { 0x04 | swap ! } { drop drop } if-else ~ Mark it used. It's faster to just do this than to check if it's needed. dup label-status dup @ 0x01 | swap ! ~ Return the guessed value. label-value @ ; ; ~ This overwrites the value of a label, also doing all necessary status ~ checks and jupdates to keep track of the cirumstances under which it was ~ set. The label loop relies on all write accesses going through this word. ~ ~ (new label value, label entry pointer --) : set-label ~ We always set the defined bit to true. We leave the other status bits ~ as-is, except that we check whether it's unused up to now, and whether ~ the previous value equals the new value. If either of those holds, we set ~ the guessed-equals-actual bit to true. dup label-status @ 0x02 | ~ (new value, entry, updated status) 3unroll dup 4 unroll label-value @ swap dup 3unroll = ~ (entry, updated status, new value, equality) 3roll dup 4 unroll 0x01 & not ~ (entry, updated status, new value, equality, unused) || { swap 0x08 | swap } if ~ (entry, updated status, new value) 3roll dup 3unroll ~ (updated status, entry, new value, entry) label-value ! label-status ! ; ~ This is a convenience helper which downstream code can use to check how ~ many bytes it has output thus-far. ~ (output memory start, current output point ~ -- output memory start, current output point, offset) : current-offset 2dup swap - ; ~ This is a concise syntax for referencing the entry pointer of a label, ~ when you know its name statically. It reads a word of text input and calls ~ intern-label on it. : L' word value@ interpreter-flags @ 0x01 & { ' litstring entry-to-execution-token , here @ swap packstring 8 packalign here ! ' intern-label entry-to-execution-token , dropstring } { intern-label dropstring-with-result } if-else ; make-immediate ~ This is a concise syntax for calling use-label with a label whose name ~ you know statically. It performs L', then calls use-label. : L@' ' L' entry-to-execution-token execute interpreter-flags @ 1 & { ' use-label entry-to-execution-token , } { use-label } if-else ; make-immediate ~ This is a concise syntax for calling set-label with a label whose name ~ you know statically. It performs L', then calls set-label. : L!' ' L' entry-to-execution-token execute interpreter-flags @ 1 & { ' set-label entry-to-execution-token , } { set-label } if-else ; make-immediate ~ This is an internal helper that label-loop uses to check if a specific ~ label entry has "converged", based on its status bits. Downstream code won't ~ need to call this directly. ~ ~ For a label to have "converged", at least one of the following must be ~ true: ~ ~ 1. The label must never have been used (bit zero clear); ~ 2. The label was both used and defined, but not used before it was defined ~ (bits zero and one set; bit two clear); ~ 3. The label was both used and defined, and the guessed value equalled the ~ actual value (bits zero, one, and three set). ~ ~ (label entry pointer -- boolean) : check-label-converged label-status @ dup 0x01 & not swap dup 0x03 & 0x03 = swap dup 0x04 & not swap 3unroll && swap 0x0b & 0x0b = || || ; ~ This is an internal helper that label-loop uses to check if the overall ~ label assignments have converged, based on their status bits. Downstream ~ code won't need to call this directly. ~ ~ This returns true if and only if check-label-converged returns true for ~ all the labels that have been created. ~ ~ (-- boolean) : check-labels-converged 1 labels @ { dup } { dup check-label-converged 3roll && swap @ } while drop ; ~ This is an internal helper that label-loop calls between passes, to update ~ the status bits and get ready for the next one. Downstream code won't need ~ to call this directly. ~ : reset-labels labels @ { dup } { dup label-status 0 swap ! @ } while drop ; ~ This is the top-level word that invokes the entire label system. All the ~ code generation happens inside it. ~ ~ The execution token it's passed should have the interface: ~ ~ (output start, current output pointer ~ -- output start, current output pointer) ~ ~ In general, this is the same interface that code generation words should ~ use to communicate with each other. For example, all the words in elf.e ~ use this interface. ~ ~ (execution token -- output start, output length) : label-loop 0 swap 0x1000 allocate dup ~ (iteration count, execution token, output start, output point) { 3 pick 100 > } { 2 pick execute 4 roll 1+ 4 unroll check-labels-converged { 4 roll drop 3 roll drop current-offset swap drop exit } if drop dup reset-labels } while drop drop drop ." Failed after " . ." iterations." newline ;