~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ ~~ Bootstrapping the log ~~
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
~
~ The log is the main region of memory within which most dynamic allocation
~ happens. It's a single contiguous segment of virtual memory, which is
~ requested from the kernel when Evocation starts up. Almost all of
~ Evocation's dynamic data is kept in the log, including the main dictionary;
~ several important global variables which make it possible to find and
~ allocate other data structures; and the control stack.
~
~ This file has the task of providing words which are useful for working
~ with the log, and more specifically which are useful for helping to bring
~ the log into existence. Once the log exists, it can be used to manage
~ itself, but there's a bootstrapping challenge in getting there. That
~ challenge is solved by the warm-start routine in execution.e, which relies
~ on the words in this file and should load after it.
~
~ Some modern Forths, including Jonesforth, refer to the log as the heap.
~ This is a misnomer; a heap is a data structure that allows non-contiguous
~ allocation. Although there are Forths that have true heaps, Evocation is not
~ one of them. Space in the log is allocated by incrementing the "here"
~ variable (one of those important globals), which necessarily can only
~ allocate contiguous blocks; there is no way to compact allocations to
~ reclaim fragmented, unused space in between them. Evocation does allow
~ deallocation using "forget", but this is done by resetting "here" and
~ "latest" to older values, unwinding every allocation that's been done since
~ the point in time they return to.
~
~ It would be a mistake to confuse this allocation strategy with the
~ more-general facilities for allocation, reallocation, and deallocation of
~ individual memory blocks that many other languages have.
~ To avoid confusion, we stay away from the name "heap", though it may still
~ occasionally be used colloquially because it's familiar from other Forths,
~ and because most programming languages have a heap as the main memory
~ segment they request from the kernel.
~
~ In the strictest technical sense, the log is a stack: Things are added
~ to the end of it, and removed from that same end. However, Evocation already
~ has two other stacks, the control and value stacks. Adding to the potential
~ confusion, the control stack is actually stored inside the log (as a
~ fixed-size chunk at the bottom). However, the log isn't really that much
~ like a stack when you look at how it's actually used. Unlike Evocation's
~ control and value stacks, data structures on the log tend to be rich and
~ complex, interlinked in various ways through the use of pointers. They also
~ tend to be long-lived, with the log tending to grow over time, whereas the
~ control and value stacks tend to remain roughly the same size through cycles
~ of growth and shrinking. In order to be able to speak precisely about what
~ we're doing, we introduce the name "log" to refer to the entire memory
~ segment and everything stored within it.
~
~ Another linguistic choice we make is to be clear about dictionaries. A
~ dictionary is a linked list of word entries. Each dictionary has a specific
~ handle, a pointer to a pointer, which is the root of the list. Each
~ word entry begins with a specific data structure, which among other things
~ includes a next-entry pointer, a flags byte, and a string that serves as
~ the entry's name. Older entries in a dictionary seldom change; newer entries
~ are added at the beginning of it, with their next-entry pointers leading to
~ the older entries. It is possible for several dictionaries to exist at once,
~ each with its own dictionary handle.
~
~ Since dictionaries are managed using pointers to individual entries, there
~ is no specific requirement about the order in which those entries occur in
~ memory or where they are allocated, but usually a new entry is allocated at
~ the end of the log, by incrementing the variable "here", in the same manner
~ as any other allocation. There is one particular dictionary, the main
~ dictionary, whose handle is the variable "latest". The main dictionary holds
~ every executable word that can be used normally via Evocation's interpreter.
~
~ Since the main dictionary is by far the most important thing in the log,
~ it can be tempting to conflate the log with the main dictionary. This is
~ accurate enough for some purposes, but note that other dictionaries are
~ often interleaved with it, their allocations entwining like grape vines even
~ while each remains separate, reachable only via its own root. See the
~ machine label facility, in labels.e, for an example of how a secondary
~ dictionary can be useful.
~
~ This may feel tangential, but it's important background and there's no
~ better place to explain it: A handle is a pointer to a pointer. The variable
~ "latest" returns a handle, a fixed address which always holds the pointer to
~ the root entry of the main dictionary. Dereferencing that handle gives you
~ the dictionary pointer, the address of the root entry, which is suitable to
~ pass to find-in and similar words that read the dictionary's contents. When
~ you want to add a new entry to a dictionary, you need the dictionary's
~ handle, so that the root pointer can be changed. When you only want to read
~ it, you only need the regular single pointer.
~
~ When reading the documentation of words that work with dictionaries, pay
~ close attention to whether their parameters include a dictionary handle, or
~ a dictionary pointer.
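~
~ As a minimal sketch of the difference, assuming only that "latest" pushes
~ the main dictionary's handle and that "@" dereferences an address, a lookup
~ in the main dictionary that starts with a string pointer on the stack would
~ proceed as follows. The stack contents after each word are shown on the
~ right. This is an illustration, not a definition taken from the codebase.
~
~     latest      (string pointer, dictionary handle)
~     @           (string pointer, dictionary pointer)
~     swap        (dictionary pointer, string pointer)
~     find-in     (entry pointer or 0)
~
~ Only the dictionary pointer is needed here, because the lookup reads the
~ list without changing its root.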
~
~ The term "handle" was widely known in the early days of microcomputing,
~ when memory-safe languages without direct pointer access were less common.
~ Today it is usually considered specific to systems programming, the type of
~ programming which lies beneath other software and deals with topics such as
~ memory management and processes. Evocation is a systems-programming
~ language, in the sense that it takes pains to not introduce mandatory
~ abstractions which would make it difficult or inefficient to work directly
~ with these topics. So, in understanding Evocation, it's important to know
~ about handles.

~ Find-in is the main word that provides the capability to look up words by
~ name, though it's usually used via "find" rather than being called directly.
~
~ Find-in traverses the linked list formed by a particular dictionary's
~ next-entry pointers, looking for an entry that matches a given name. The
~ dictionary pointer is the pointer (not handle) to the root of the list,
~ which runs from newest to oldest. For example, dereferencing the value of
~ "latest" gives the pointer to the main dictionary, which can be passed to
~ find-in.
~
~ Having find-in separated out is convenient when working with alternate
~ dictionaries, but the main reason for having it is not convenience but
~ necessity: During Evocation's startup, there is a period before global
~ variables are easily accessible, so there would be no way to implement
~ "find". The warm-start routine (see execution.e and transform.e) has the
~ job of fixing that, and it makes extensive use of find-in to do so.
~
~ (dictionary pointer, string pointer -- entry pointer or 0)
: find-in
  ~ It will be more convenient to have the entry pointer on top.
  swap
  {
    ~ If the entry pointer is null, exit.
    ~ (name pointer to find, current entry pointer)
    dup 0 = { swap drop exit } if
    ~ Check this entry's "hidden" flag.
    ~ (name pointer to find, current entry pointer)
    dup entry-flags@ 0x80 & 0x80 != {
      ~ Test whether this entry is a match.
      ~ (name pointer to find, current entry pointer)
      2dup 10 + stringcmp 0 = {
        ~ If we're here, it's a match. Clean up our working state and exit.
        ~ (name pointer to find, current entry pointer)
        swap drop exit
      } if
    } if
    ~ If we're here, it's not a match; traverse the pointer and repeat.
    ~ (name pointer to find, current entry pointer)
    @
  } forever ;

~ This must have the same value as the constant control-stack-size, which is
~ defined in execution.e. Everything will break if it doesn't.
~
~ TODO: remove one of them. Probably the other one.
: log-offset 0x10000 ; ~ 64 KiB

~ (log address -- log address, "latest" pointer)
: log-load-latest dup log-offset + 3 8 * + ;

~ (log address -- log address, "here" pointer)
: log-load-here dup log-offset + 4 8 * + ;

~ This is a helper used by warm-start, which invokes find-in using "latest".
~ It relies on being passed the root address of the log, which is used to find
~ the global variable "latest". It's inconvenient to keep a log pointer around
~ all the time, which is why we stop doing it as soon as possible, but during
~ Evocation's startup there's no alternative. This word is used extensively
~ by code that's been compiled via the log-load transform; see transform.e for
~ details.
~
~ It would be possible to unload this word after the log is created, but
~ there are rare situations in which it's still useful, such as injecting
~ Evocation into another process's address space. Plus, it's small. So, we
~ keep it around.
~
~ (log address, string pointer -- log address, entry pointer or 0)
: log-load-find swap log-load-latest @ swap 3unroll swap find-in ;
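
~ As a worked example of the address arithmetic in log-load-latest and
~ log-load-here, take a hypothetical log address of 0x1000000. That address
~ is chosen purely for illustration; the real one comes from the kernel.
~
~     log-load-latest leaves 0x1000000 + 0x10000 + 3 * 8 = 0x1010018
~     log-load-here   leaves 0x1000000 + 0x10000 + 4 * 8 = 0x1010020
~
~ In both cases the leading "dup" keeps the original log address on the
~ stack beneath the result, so the caller can keep passing it along.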