~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ ~~ Executable file format ~~
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~
~   Before we do anything specific to the actual program we're building, we
~ do a lot of ELF-specific stuff to ensure that our output is in a format
~ Linux knows how to run.
~
~   This relies on the label facility defined in labels.e. Make sure to load
~ that first.

~ ~~~~~~~~~~~~~~~~~~~~~
~ ~~ ELF file header ~~
~ ~~~~~~~~~~~~~~~~~~~~~
~
~   First, we output ELF's top-level file header. This header describes the
~ entire file. An ELF file always has exactly one such header, and it is
~ always at the start of the file.
~
~   The program we're building should call this word as the first output it
~ generates.
~
~   The only interesting field here is the entry point, which is the virtual
~ address where execution begins.

: elf-file-header
  ~ * denotes mandatory fields according to breadbox
  current-offset 3unroll

  0x7f pack8 s" ELF" pack-raw-string    ~ *magic number
  2 pack8                               ~ 64-bit
  1 pack8                               ~ little-endian
  1 pack8                               ~ ELF header format v1
  0 pack8                               ~ System-V ABI
  0 pack64                              ~ ABI version, then padding

  2 pack16                              ~ *executable
  0x3e pack16                           ~ *Intel x86-64
  1 pack32                              ~ ELF format version

  L@' cold-start L@' origin + pack64        ~ *entry point
    ~ This includes the origin, intentionally.

  L@' elf-program-header pack64         ~ *program header offset
    ~ We place the program header immediately after the ELF header. This
    ~ offset is from the start of the file.
  0 pack64                              ~ section header offset
  0 pack32                              ~ processor flags

  L@' elf-header-size pack16            ~ ELF header size
  L@' elf-program-header-size pack16    ~ *program header entry size
  1 pack16                              ~ *number of program header entries
  0 pack16                              ~ section header entry size
  0 pack16                              ~ number of section header entries
  0 pack16                              ~ section name string table index

  ~   Though hardcoding the size of this header would work fine, it's easier
  ~ to use the label system to keep track of its size. The only place this is
  ~ actually referenced is right here in the header.
  current-offset 4 roll - L!' elf-header-size ;
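
~   As a cross-check on the field order above, here is a minimal sketch in
~ Python (not part of this codebase) that packs the same 64-bit ELF file
~ header with the standard struct module. The function name and the entry
~ point and program header offset passed to it are placeholders chosen for
~ illustration, not the values the label system would compute.

```python
import struct

def elf_file_header(entry, phoff, ehsize=64, phentsize=56):
    """Pack an ELF64 file header in the same field order as elf-file-header."""
    return struct.pack(
        "<4sBBBBQHHIQQQIHHHHHH",
        b"\x7fELF",  # magic number
        2,           # 64-bit
        1,           # little-endian
        1,           # ELF header format v1
        0,           # System-V ABI
        0,           # ABI version and padding (8 bytes)
        2,           # executable
        0x3E,        # x86-64
        1,           # ELF format version
        entry,       # entry point (virtual address)
        phoff,       # program header offset (from start of file)
        0,           # section header offset
        0,           # processor flags
        ehsize,      # ELF header size
        phentsize,   # program header entry size
        1,           # number of program header entries
        0, 0, 0,     # section header entry size, count, string table index
    )

# Placeholder values: a hypothetical origin of 0x400000 with code at offset 120.
header = elf_file_header(entry=0x400000 + 120, phoff=64)
print(len(header))  # 64
```

~   The packed result is 64 bytes, which is the value the elf-header-size
~ label should end up holding for this field layout.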


~ ~~~~~~~~~~~~~~~~~~~~~~~~
~ ~~ ELF program header ~~
~ ~~~~~~~~~~~~~~~~~~~~~~~~
~
~   Second, we output ELF's program header, which lists the memory regions
~ ("segments") we want to have and where we want them to come from. There may
~ be any number of these entries, one per segment, and they may be anywhere
~ in the file as long as they're consecutive.
~
~   We list just a single region, which is the entire contents of the ELF file
~ from disk, and we put the program header immediately after the file header.
~ The program we're building should call this word as the second output it
~ generates.
~
~   It would be more typical to use this header to ask the loader to give us
~ separate code and data segments, and perhaps a stack or heap, but this keeps
~ things simple, and we can create those things for ourselves later.
~
~   We do have a little stack space available, though we don't explicitly
~ request any; the kernel allocates it for us as part of exec() so that it can
~ pass us argc and argv (which we ignore). That stack space will be at a
~ random address, different every time, because of ASLR; that's a neat
~ security feature, so we leave it as-is. Note that gdb disables ASLR by
~ default for the programs it runs, so if you aren't seeing it, that's
~ probably why.

~ ~~
: elf-program-header
  ~ * denotes mandatory fields according to breadbox
  current-offset L!' elf-program-header
  current-offset 3unroll

  1 pack32                              ~ *"loadable" segment type
  0x05 pack32                           ~ *read+execute permission
  0 pack64                              ~ *offset in file
  L@' origin pack64                     ~ *virtual address
    ~ required, but can be anything, subject to alignment
  0 pack64                              ~ physical address (ignored)

  L@' total-size pack64                 ~ *size in file
  L@' total-size pack64                 ~ *size in memory

  0 pack64                              ~ segment alignment
    ~ for relocation, but this doesn't apply to us

  ~   As with the file header, we use the label system to keep track of the
  ~ program header's size.
  current-offset 4 roll - L!' elf-program-header-size ;
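
~   The same kind of cross-check works for the program header. This is a
~ sketch in Python, not part of this codebase; the function name, the origin
~ and the total size used below are hypothetical values for illustration.

```python
import struct

def elf_program_header(origin, total_size, flags=0x05):
    """Pack one ELF64 program header entry in elf-program-header's field order."""
    return struct.pack(
        "<IIQQQQQQ",
        1,           # "loadable" segment type
        flags,       # permissions; 0x05 is read+execute
        0,           # offset in file
        origin,      # virtual address
        0,           # physical address (ignored)
        total_size,  # size in file
        total_size,  # size in memory
        0,           # segment alignment
    )

# Hypothetical values: origin 0x400000 and a 4096-byte file.
phdr = elf_program_header(0x400000, 4096)
print(len(phdr))  # 56
```

~   The packed entry is 56 bytes, which is the value the
~ elf-program-header-size label should end up holding for this layout.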


~   This alternate version of the program header isn't used by evoke.e, but
~ may be useful for other programs, such as hex.e. Its only difference from
~ the base version is that it adds the write permission.
: elf-program-header-writable
  ~ * denotes mandatory fields according to breadbox
  current-offset L!' elf-program-header
  current-offset 3unroll

  1 pack32                              ~ *"loadable" segment type
  0x07 pack32                           ~ *read+write+execute permission
  0 pack64                              ~ *offset in file
  L@' origin pack64                     ~ *virtual address
    ~ required, but can be anything, subject to alignment
  0 pack64                              ~ physical address (ignored)

  L@' total-size pack64                 ~ *size in file
  L@' total-size pack64                 ~ *size in memory

  0 pack64                              ~ segment alignment
    ~ for relocation, but this doesn't apply to us

  ~   As with the file header, we use the label system to keep track of the
  ~ program header's size.
  current-offset 4 roll - L!' elf-program-header-size ;
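
~   The two permission values differ by a single bit. As a small sketch in
~ Python (not part of this codebase, with a hypothetical helper name), the
~ flags field can be decoded using the bit values the ELF format assigns to
~ execute, write and read.

```python
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4  # permission bits in a program header's flags

def describe_flags(flags):
    """Return the permission names encoded in a program header's flags field."""
    names = []
    if flags & PF_R:
        names.append("read")
    if flags & PF_W:
        names.append("write")
    if flags & PF_X:
        names.append("execute")
    return "+".join(names)

print(describe_flags(0x05))  # read+execute
print(describe_flags(0x07))  # read+write+execute
```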

~ ~~~~~~~~~~~~~~~~
~ ~~ That's it! ~~
~ ~~~~~~~~~~~~~~~~
~
~   ELF is a simple format, really.  Now you can output your own machine code
~ that you generate however you want; make sure to define the labels "origin"
~ and "cold-start". Origin will control the address the code loads at;
~ cold-start will be the first thing that runs. The origin is arbitrary, but
~ can't be zero.
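~
~   To make the relationship between the two labels concrete, here is a small
~ worked example in Python (not part of this codebase). Because the single
~ segment maps file offset 0 to the origin, the entry point is the origin
~ plus cold-start's offset in the file. The origin of 0x400000 and the
~ placement of cold-start directly after the two headers are assumptions
~ made for illustration only.

```python
ELF_HEADER_SIZE = 64      # bytes, 64-bit ELF file header
PROGRAM_HEADER_SIZE = 56  # bytes, one 64-bit ELF program header entry

def entry_point(origin, cold_start_offset):
    """Entry point as a virtual address: load address plus file offset."""
    if origin == 0:
        raise ValueError("origin can't be zero")
    return origin + cold_start_offset

# Assumption: cold-start sits right after the two headers.
cold_start_offset = ELF_HEADER_SIZE + PROGRAM_HEADER_SIZE
print(hex(entry_point(0x400000, cold_start_offset)))  # 0x400078
```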