Diffstat (limited to 'elf.e')
-rw-r--r-- elf.e 159
1 files changed, 115 insertions, 44 deletions
diff --git a/elf.e b/elf.e
index 4224b50..c38f740 100644
--- a/elf.e
+++ b/elf.e
@@ -1,57 +1,128 @@
-~ ~~
-~ ~~ ELF header
-~ ~~
-~ ~~   This is the top-level ELF header, for the entire file. An ELF always
-~ ~~ has exactly one of this header, which is always at the start of the file.
-~ ~~
+~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~ ~~ Executable file format ~~
+~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~
+~   Before we do anything specific to the actual program we're building, we
+~ do a lot of ELF-specific stuff to ensure that our output is in a format
+~ Linux knows how to run.
+~
+~   This relies on the label facility defined in labels.e. Make sure to load
+~ that first.
+
+~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~ ~~ Runtime memory origin ~~
+~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~
+~   First, we pick an origin to load at. This is arbitrary, but it can't be
+~ zero. We define a constant word for it so the body of the program can use
+~ it in label calculations in whatever ways it needs to.
+
+: origin 0x08000000 ;
+
+
+~ ~~~~~~~~~~~~~~~~~~~~~
+~ ~~ ELF file header ~~
+~ ~~~~~~~~~~~~~~~~~~~~~
+~
+~   Second, we output ELF's top-level file header. This header describes the
+~ entire file. An ELF file always has exactly one such header, which is always
+~ at the start of the file.
+~
+~   The program we're building should call this word as the first output it
+~ generates.
+~
+~   The only interesting thing here is the entry pointer.
+
 : elf-file-header
-  0x7f pack8 s" ELF" pack-raw-string        ~ magic number
-  2 pack8                                   ~ 64-bit
-  1 pack8                                   ~ little-endian
-  1 pack8                                   ~ ELF header format v1
-  0 pack8                                   ~ System-V ABI
-  0 pack64                                  ~ (padding)
-
-  2 pack16                                  ~ executable
-  0x3e pack16                               ~ Intel x86-64
-  1 pack32                                  ~ ELF format version
-
-  L' start use-label origin + pack64        ~ entry point
+  ~ * denotes mandatory fields according to breadbox
+  current-offset 3unroll
+
+  0x7f pack8 s" ELF" pack-raw-string    ~ *magic number
+  2 pack8                               ~ 64-bit
+  1 pack8                               ~ little-endian
+  1 pack8                               ~ ELF header format v1
+  0 pack8                               ~ System-V ABI
+  0 pack64                              ~ (padding)
+
+  2 pack16                              ~ *executable
+  0x3e pack16                           ~ *Intel x86-64
+  1 pack32                              ~ ELF format version
+
+  L@' cold-start origin + pack64        ~ *entry point
     ~ This includes the origin, intentionally.
 
-  L' program-header use-label pack64        ~ program header offset
+  L@' elf-program-header pack64         ~ *program header offset
     ~ We place the program header immediately after the ELF header. This
     ~ offset is from the start of the file.
-  0 pack64                                  ~ section header offset
-  0 pack32                                  ~ processor flags
-  64 pack16                                 ~ ELF header size
-  56 pack16                                 ~ program header entry size
-  1 pack16                                  ~ number of program header entries
-  0 pack16                                  ~ section header entry size
-  0 pack16                                  ~ number of section header entries
-  0 pack16                                  ~ section name string table index
-  ;
+  0 pack64                              ~ section header offset
+  0 pack32                              ~ processor flags
+
+  L@' elf-header-size pack16            ~ ELF header size
+  L@' elf-program-header-size pack16     ~ *program header entry size
+  1 pack16                              ~ *number of program header entries
+  0 pack16                              ~ section header entry size
+  0 pack16                              ~ number of section header entries
+  0 pack16                              ~ section name string table index
+
+  ~   Though hardcoding the size of this header would work fine, it's easier
+  ~ to use the label system to keep track of its size. The only place this is
+  ~ actually referenced is right here in the header.
+  current-offset 4 roll - L!' elf-header-size ;
+
+
+~ ~~~~~~~~~~~~~~~~~~~~~~~~
+~ ~~ ELF program header ~~
+~ ~~~~~~~~~~~~~~~~~~~~~~~~
+~
+~   Third, we output ELF's program header, which lists the memory regions
+~ ("segments") we want to have and where we want them to come from. There may
+~ be any number of these entries, one per segment, and they may be anywhere
+~ in the file as long as they're consecutive.
+~
+~   We list just a single region, which is the entire contents of the ELF file
+~ from disk, and we put the program header immediately after the file header.
+~ The program we're building should call this word as the second output it
+~ generates.
+~
+~   It would be more typical to use this header to ask the loader to give us
+~ separate code and data segments, and perhaps a stack or heap, but this keeps
+~ things simple, and we can create those things for ourselves later.
+~
+~   We do have a little stack space available, though we don't explicitly
+~ request any; the kernel allocates it for us as part of exec() so that it can
+~ pass us argc and argv (which we ignore). That stack space will be at a
+~ random address, different every time, because of ASLR; that's a neat
+~ security feature, so we leave it as-is. Note that ASLR doesn't happen when
+~ you run under gdb, so if you aren't seeing it, that's probably why.
 
-~ ~~
-~ ~~ Program header
-~ ~~
-~ ~~   An ELF program header consists of any number of these entries; they are
-~ ~~ always consecutive, but may be anywhere in the file. We always have
-~ ~~ exactly one, and it's always right after the ELF file header.
 ~ ~~
 : elf-program-header
-  current-offset L' program-header set-label
-  1 pack32                                  ~ "loadable" segment type
-  0x05 pack32                               ~ read+execute permission
-  0 pack64                                  ~ offset in file
-  origin pack64                             ~ virtual address
+  ~ * denotes mandatory fields according to breadbox
+  current-offset L!' elf-program-header
+  current-offset 3unroll
+
+  1 pack32                              ~ *"loadable" segment type
+  0x05 pack32                           ~ *read+execute permission
+  0 pack64                              ~ *offset in file
+  origin pack64                         ~ *virtual address
     ~ required, but can be anything, subject to alignment
-  0 pack64                                  ~ physical address (ignored)
+  0 pack64                              ~ physical address (ignored)
 
-  L' total-size use-label pack64            ~ size in file
-  L' total-size use-label pack64            ~ size in memory
+  L@' total-size pack64                 ~ *size in file
+  L@' total-size pack64                 ~ *size in memory
 
-  0 pack64                                  ~ segment alignment
+  0 pack64                              ~ segment alignment
     ~ for relocation, but this doesn't apply to us
-  ;
+
+  ~   As with the file header, we use the label system to keep track of the
+  ~ program header's size.
+  current-offset 4 roll - L!' elf-program-header-size ;
+
+~ ~~~~~~~~~~~~~~~~
+~ ~~ That's it! ~~
+~ ~~~~~~~~~~~~~~~~
+~
+~   ELF is a simple format, really.  Now you can output your own machine code
+~ that you generate however you want; make sure to define the label
+~ cold-start, which will be the first thing that runs.
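
For readers who don't read this Forth dialect, the byte layout that the two header words emit can be sketched in Python. This is a minimal sketch, not the author's code: the function names, the `build_image` helper, and the placeholder machine code are hypothetical. The field values and the origin constant are copied from the diff, and the sketch assumes the standard little-endian ELF64 layout, with the 64-byte file header and 56-byte program header entry that the removed lines hardcoded.

```python
import struct

ORIGIN = 0x08000000   # same value as the `origin` word in the diff
EHDR_SIZE = 64        # ELF64 file header size (was hardcoded in the old version)
PHDR_SIZE = 56        # ELF64 program header entry size (likewise)

# Hypothetical placeholder for "your own machine code":
# mov eax, 60 ; xor edi, edi ; syscall   (x86-64 Linux exit(0))
EXIT_CODE = b"\xb8\x3c\x00\x00\x00\x31\xff\x0f\x05"


def elf_file_header(entry_offset: int) -> bytes:
    """Pack the fields in the same order as the elf-file-header word."""
    # magic, 64-bit, little-endian, header format v1, System-V ABI, padding
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    rest = struct.pack(
        "<HHIQQQIHHHHHH",
        2,                      # executable
        0x3E,                   # Intel x86-64
        1,                      # ELF format version
        ORIGIN + entry_offset,  # entry point, including the origin
        EHDR_SIZE,              # program header offset: right after this header
        0,                      # section header offset
        0,                      # processor flags
        EHDR_SIZE,              # ELF header size
        PHDR_SIZE,              # program header entry size
        1,                      # number of program header entries
        0, 0, 0,                # section header entry size, count, name index
    )
    return ident + rest


def elf_program_header(total_size: int) -> bytes:
    """Pack the fields in the same order as the elf-program-header word."""
    return struct.pack(
        "<IIQQQQQQ",
        1,           # "loadable" segment type
        0x05,        # read+execute permission
        0,           # offset in file
        ORIGIN,      # virtual address
        0,           # physical address (ignored)
        total_size,  # size in file
        total_size,  # size in memory
        0,           # segment alignment
    )


def build_image(code: bytes) -> bytes:
    """File header, then program header, then the code at 'cold-start'."""
    cold_start = EHDR_SIZE + PHDR_SIZE
    total_size = cold_start + len(code)
    return elf_file_header(cold_start) + elf_program_header(total_size) + code


image = build_image(EXIT_CODE)
```

The sketch computes the sizes up front, whereas the commit derives them from `current-offset` differences through the label system; both approaches should yield the same header bytes as long as the field list is unchanged.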