~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ ~~ Executable file format ~~
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~
~   Before we do anything specific to the actual program we're building, we
~ do a lot of ELF-specific stuff to ensure that our output is in a format
~ Linux knows how to run.
~
~   This relies on the label facility defined in labels.e. Make sure to load
~ that first.

~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ ~~ Runtime memory origin ~~
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
~
~   First, we pick an origin to load at. This is arbitrary, but it can't be
~ zero. We define a constant word for it so the body of the program can use
~ it in label calculations in whatever ways it needs to.

: origin 0x08000000 ;


~ ~~~~~~~~~~~~~~~~~~~~~
~ ~~ ELF file header ~~
~ ~~~~~~~~~~~~~~~~~~~~~
~
~   Second, we output ELF's top-level file header. This header describes the
~ entire file. An ELF file always has exactly one such header, and it is
~ always at the start of the file.
~
~   The program we're building should call this word as the first output it
~ generates.
~
~   The only interesting thing here is the entry point.

: elf-file-header
  ~ * denotes mandatory fields according to breadbox
  current-offset 3unroll

  0x7f pack8 s" ELF" pack-raw-string    ~ *magic number
  2 pack8                               ~ 64-bit
  1 pack8                               ~ little-endian
  1 pack8                               ~ ELF header format v1
  0 pack8                               ~ System-V ABI
  0 pack64                              ~ (padding)

  2 pack16                              ~ *executable
  0x3e pack16                           ~ *Intel x86-64
  1 pack32                              ~ ELF format version

  L@' cold-start origin + pack64        ~ *entry point
    ~ This includes the origin, intentionally.

  L@' elf-program-header pack64         ~ *program header offset
    ~ We place the program header immediately after the ELF header. This
    ~ offset is from the start of the file.
  0 pack64                              ~ section header offset
  0 pack32                              ~ processor flags

  L@' elf-header-size pack16            ~ ELF header size
  L@' elf-program-header-size pack16    ~ *program header entry size
  1 pack16                              ~ *number of program header entries
  0 pack16                              ~ section header entry size
  0 pack16                              ~ number of section header entries
  0 pack16                              ~ section name string table index

  ~   Though hardcoding the size of this header would work fine, it's easier
  ~ to use the label system to keep track of its size. The only place this is
  ~ actually referenced is right here in the header.
  current-offset 4 roll - L!' elf-header-size ;
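
~   For readers who want to cross-check the field layout outside this
~ language, here is a minimal sketch of the same header using Python's struct
~ module. It is an illustration only, not part of the build. EHDR_SIZE,
~ PHDR_SIZE and the entry_offset parameter are assumptions standing in for
~ the labels elf-header-size, elf-program-header-size and cold-start.

```python
import struct

ORIGIN = 0x08000000   # same value as the origin word
EHDR_SIZE = 64        # assumed: size of a 64-bit ELF file header
PHDR_SIZE = 56        # assumed: size of one 64-bit program header entry


def elf_file_header(entry_offset):
    """Pack a 64-bit little-endian ELF file header.

    entry_offset is a hypothetical parameter: the file offset of the first
    instruction, standing in for the cold-start label.
    """
    # Magic number, 64-bit, little-endian, header format v1, System-V ABI,
    # then 8 bytes of padding.
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    return ident + struct.pack(
        "<HHIQQQIHHHHHH",
        2,                      # executable
        0x3E,                   # Intel x86-64
        1,                      # ELF format version
        ORIGIN + entry_offset,  # entry point, origin included
        EHDR_SIZE,              # program header offset, from start of file
        0,                      # section header offset
        0,                      # processor flags
        EHDR_SIZE,              # ELF header size
        PHDR_SIZE,              # program header entry size
        1,                      # number of program header entries
        0,                      # section header entry size
        0,                      # number of section header entries
        0,                      # section name string table index
    )


header = elf_file_header(EHDR_SIZE + PHDR_SIZE)
```

~   The sketch hardcodes the two sizes, which is exactly what the label
~ system lets the word above avoid.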


~ ~~~~~~~~~~~~~~~~~~~~~~~~
~ ~~ ELF program header ~~
~ ~~~~~~~~~~~~~~~~~~~~~~~~
~
~   Third, we output ELF's program header, which lists the memory regions
~ ("segments") we want to have and where we want them to come from. There may
~ be any number of these entries, one per segment, and they may be anywhere
~ in the file as long as they're consecutive.
~
~   We list just a single region, which is the entire contents of the ELF file
~ from disk, and we put the program header immediately after the file header.
~ The program we're building should call this word as the second output it
~ generates.
~
~   It would be more typical to use this header to ask the loader to give us
~ separate code and data segments, and perhaps a stack or heap, but this keeps
~ things simple, and we can create those things for ourselves later.
~
~   We do have a little stack space available, though we don't explicitly
~ request any; the kernel allocates it for us as part of exec() so that it can
~ pass us argc and argv (which we ignore). That stack space will be at a
~ random address, different every time, because of ASLR; that's a neat
~ security feature, so we leave it as-is. Note that gdb disables ASLR by
~ default, so if you aren't seeing it under gdb, that's probably why.

~ ~~
: elf-program-header
  ~ * denotes mandatory fields according to breadbox
  current-offset L!' elf-program-header
  current-offset 3unroll

  1 pack32                              ~ *"loadable" segment type
  0x05 pack32                           ~ *read+execute permission
  0 pack64                              ~ *offset in file
  origin pack64                         ~ *virtual address
    ~ required, but can be anything, subject to alignment
  0 pack64                              ~ physical address (ignored)

  L@' total-size pack64                 ~ *size in file
  L@' total-size pack64                 ~ *size in memory

  0 pack64                              ~ segment alignment
    ~ for relocation, but this doesn't apply to us

  ~   As with the file header, we use the label system to keep track of the
  ~ program header's size.
  current-offset 4 roll - L!' elf-program-header-size ;
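
~   As a cross-check of the field order, here is a minimal sketch of the same
~ single program header entry using Python's struct module. It is an
~ illustration only, not part of the build. The total_size parameter is an
~ assumption standing in for the total-size label.

```python
import struct

ORIGIN = 0x08000000   # same value as the origin word


def elf_program_header(total_size):
    """Pack one 64-bit little-endian program header entry.

    total_size is a hypothetical parameter: the size of the whole file,
    standing in for the total-size label.
    """
    return struct.pack(
        "<IIQQQQQQ",
        1,           # "loadable" segment type
        0x05,        # read+execute permission
        0,           # offset in file
        ORIGIN,      # virtual address
        0,           # physical address (ignored)
        total_size,  # size in file
        total_size,  # size in memory
        0,           # segment alignment
    )


entry = elf_program_header(0x1000)
```

~   Note that in the 64-bit layout the permission flags come second, right
~ after the segment type.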

~ ~~~~~~~~~~~~~~~~
~ ~~ That's it! ~~
~ ~~~~~~~~~~~~~~~~
~
~   ELF is a simple format, really.  Now you can output your own machine code
~ that you generate however you want; make sure to define the label
~ cold-start, which will be the first thing that runs.
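
~   To show how the pieces fit together, here is a minimal sketch in Python
~ that lays out both headers followed by a tiny body. It is an illustration
~ under assumptions, not this project's build: the names HEADERS, CODE and
~ build_image are hypothetical, and the body is just an exit(0) system call
~ for x86-64 Linux. The point is the ordering problem the labels solve: the
~ entry point and total size depend on where the body lands and how long it
~ is, so they are computed before the headers are packed.

```python
import struct

ORIGIN = 0x08000000   # same value as the origin word

# File header (64 bytes) followed by one program header entry (56 bytes).
HEADERS = "<4sBBBB8xHHIQQQIHHHHHH" "IIQQQQQQ"
HEADERS_SIZE = struct.calcsize(HEADERS)

# Hypothetical body: mov eax, 60; xor edi, edi; syscall
CODE = bytes.fromhex("b83c000000" "31ff" "0f05")


def build_image(code):
    """Return headers plus code as one byte string."""
    cold_start = HEADERS_SIZE              # body starts right after headers
    total_size = HEADERS_SIZE + len(code)  # the single segment is the file
    headers = struct.pack(
        HEADERS,
        # ELF file header
        b"\x7fELF", 2, 1, 1, 0,
        2, 0x3E, 1, ORIGIN + cold_start, 64, 0, 0, 64, 56, 1, 0, 0, 0,
        # program header entry
        1, 0x05, 0, ORIGIN, 0, total_size, total_size, 0,
    )
    return headers + code


image = build_image(CODE)
```

~   Writing image to a file and marking it executable is left out of the
~ sketch on purpose.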