Commit 1532aa5

[WIP] Add new documents to explain static and dynamic linking modes
Since the compiler supports both static and dynamic linking, this commit adds two new documents that explain the following:
- How to build the statically and dynamically linked versions of shecc.
- Stack frame layout in static and dynamic linking modes.
- Function argument handling and calling convention.
- Runtime execution flow.
- The dynamic sections generated in dynamic linking mode.
1 parent d745196 commit 1532aa5

2 files changed: +218 -0 lines changed

docs/dynamic-linking.md

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
# Dynamic Linking

## Build dynamically linked shecc and programs

Build the dynamically linked version of shecc as follows. Dynamic linking is not yet supported for the RISC-V architecture, so the example targets Arm:

```shell
$ make ARCH=arm DYNLINK=1
```

Next, use shecc to build dynamically linked programs by adding the `--dynlink` flag:

```shell
# Use the stage 0 compiler
$ out/shecc --dynlink -o <output> <input.c>
# Use the stage 1 or stage 2 compiler
$ qemu-arm -L <LD_PREFIX> out/shecc-stage2.elf --dynlink -o <output> <input.c>

# Execute the compiled program
$ qemu-arm -L <LD_PREFIX> <output>
```

When executing a dynamically linked program under `qemu-arm`, set the ELF interpreter prefix (`-L <LD_PREFIX>`) so that `ld.so` can be found. If the Arm GNU toolchain was installed with `apt`, the prefix is typically `/usr/arm-linux-gnueabihf`. If the toolchain was installed manually, locate and specify the correct path instead.

## Stack frame layout

In dynamic linking mode, the stack frame layout for each function can be illustrated as follows:

```
High Address
+------------------+
| incoming args    |
+------------------+ <- sp + total_size
| saved lr         |
+------------------+ <- sp + total_size - 4
| saved r11        |
+------------------+
| saved r10        |
+------------------+
| saved r9         |
+------------------+
| saved r8         |
+------------------+
| saved r7         |
+------------------+
| saved r6         |
+------------------+
| saved r5         |
+------------------+
| saved r4         |
+------------------+
| local variables  |
+------------------+ <- sp + (MAX_PARAMS - MAX_ARGS_IN_REG) * 4
| outgoing args    |
+------------------+ <- sp (MUST be aligned to 8 bytes)
Low Address
```

* `total_size`: includes the size of the following elements:
  * `outgoing args`: a fixed size - `(MAX_PARAMS - MAX_ARGS_IN_REG) * 4` bytes
  * All local variables
  * `saved r4-r11 and lr`: a fixed size - 36 bytes

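The arithmetic behind `total_size` can be sketched in C. This is only an illustration under stated assumptions: the values `MAX_PARAMS = 8` and `MAX_ARGS_IN_REG = 4` are assumed here, the helper names `align8` and `frame_total_size` are hypothetical, and rounding the sum up to a multiple of 8 is one plausible way to keep `sp` 8-byte aligned rather than a description of what shecc actually emits.

```c
#include <assert.h>

/* Assumed values for illustration; the real constants are defined in shecc. */
#define MAX_PARAMS 8
#define MAX_ARGS_IN_REG 4

/* r4-r11 (8 registers) plus lr, 4 bytes each: 36 bytes. */
#define SAVED_REGS_SIZE (9 * 4)

/* Round n up to the next multiple of 8. */
static int align8(int n)
{
    return (n + 7) & ~7;
}

/* Hypothetical helper: frame size for a function whose local variables
 * occupy locals_size bytes. */
int frame_total_size(int locals_size)
{
    assert(locals_size >= 0);
    int outgoing = (MAX_PARAMS - MAX_ARGS_IN_REG) * 4; /* fixed outgoing area */
    /* Assumption: pad the sum so that sp stays 8-byte aligned. */
    return align8(outgoing + locals_size + SAVED_REGS_SIZE);
}
```

With these assumed constants, a function without local variables needs 16 + 0 + 36 = 52 bytes, which rounds up to 56.
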
## About function arguments handling

### Arm (32-bit)

If the callee is an internal function, meaning that its implementation is compiled by shecc, the caller places all arguments directly into registers `r0` - `r7`.

Otherwise, when the callee is an external function, the caller performs the following operations to comply with the Procedure Call Standard for the Arm Architecture (AAPCS):

* The first four arguments are placed in `r0` - `r3`.
* Any additional arguments are passed on the stack. They are pushed starting from the last argument, so the fifth argument ends up at the lowest address and the last argument at the highest address.
* The stack pointer is aligned to 8 bytes, because external functions may access 8-byte objects, which require 8-byte alignment.

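The placement rules for external calls can be modelled with a small C sketch. It assumes every argument is one 32-bit word and that offsets are measured from `sp` at the moment of the call; the type `arg_slot_t` and the function `aapcs_arg_slot` are hypothetical names introduced for illustration, not part of shecc.

```c
#include <assert.h>

#define AAPCS_REG_ARGS 4 /* r0-r3 carry the first four word-sized arguments */

/* Where one word-sized argument lives at the moment of the call. */
typedef struct {
    int in_register; /* 1 if passed in a register, 0 if passed on the stack */
    int reg;         /* register number 0-3 when in_register is 1, else -1 */
    int sp_offset;   /* byte offset from sp when in_register is 0, else -1 */
} arg_slot_t;

/* index is zero-based: index 0 is the first argument. */
arg_slot_t aapcs_arg_slot(int index)
{
    assert(index >= 0);
    arg_slot_t slot = {0, -1, -1};
    if (index < AAPCS_REG_ARGS) {
        slot.in_register = 1;
        slot.reg = index;
    } else {
        /* The fifth argument sits at the lowest address, i.e. at sp + 0. */
        slot.sp_offset = (index - AAPCS_REG_ARGS) * 4;
    }
    return slot;
}
```

For an eight-argument call, the model puts the fifth argument at `sp + 0` and the eighth at `sp + 12`, which matches the ordering described in the list above.
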
### RISC-V (32-bit)

(Currently not supported)

## Runtime execution flow

1. The program starts at the ELF entry point.
2. The dynamic linker (`ld.so`) is invoked.
   * For the Arm architecture, the dynamic linker is `/lib/ld-linux-armhf.so.3`.
3. The linker loads shared libraries such as `libc.so`.
4. The linker resolves symbols and fills the global offset table (GOT).
5. Control transfers to the program.
6. The program begins by calling `__libc_start_main`.
7. `__libc_start_main` calls the *main wrapper*, which pushes registers `r4` - `r11` and `lr` onto the stack, sets up a global stack for all global variables (excluding read-only variables), and initializes them.
8. The *main wrapper* then invokes the `main` function.
9. After the `main` function returns, the *main wrapper* restores the necessary registers and passes control back to `__libc_start_main`, which implicitly calls `exit(3)` to terminate the program.

## Dynamic sections

When using dynamic linking, the following sections are generated for compiled programs:

1. `.interp` - Path to dynamic linker
2. `.dynsym` - Dynamic symbol table
3. `.dynstr` - Dynamic string table
4. `.rel.plt` - PLT relocations
5. `.plt` - Procedure Linkage Table
6. `.got` - Global Offset Table
7. `.dynamic` - Dynamic linking information

### PLT explanation for Arm32

The first entry contains the following instructions, which invoke the resolver to perform relocation:

```
push {lr}                       @ (str lr, [sp, #-4]!)
movw sl, #:lower16:(&GOT[2])
movt sl, #:upper16:(&GOT[2])
mov  lr, sl
ldr  pc, [lr]
```

1. Push register `lr` onto the stack.
2. Set register `sl` to the address of `GOT[2]`.
3. Copy the value of `sl` to `lr`.
4. Load the value located at `[lr]` into the program counter (`pc`), that is, jump to the address stored in `GOT[2]`.

The remaining entries correspond to the external functions, one entry per function, each consisting of the following instructions:

```
movw ip, #:lower16:(&GOT[x])
movt ip, #:upper16:(&GOT[x])
ldr  pc, [ip]
```

1. Set register `ip` to the address of `GOT[x]`.
2. Load the value of `GOT[x]` into `pc`, that is, jump to the address of the callee.

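Both PLT stubs build a 32-bit GOT address from two 16-bit immediates. The following C sketch models that split; the function names `lower16`, `upper16`, and `movw_movt` are hypothetical and only mirror the `#:lower16:` and `#:upper16:` operators, and the address used in the note afterwards is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Immediate encoded in `movw`: the low 16 bits of the address. */
uint16_t lower16(uint32_t addr)
{
    return (uint16_t)(addr & 0xFFFFu);
}

/* Immediate encoded in `movt`: the high 16 bits of the address. */
uint16_t upper16(uint32_t addr)
{
    return (uint16_t)(addr >> 16);
}

/* Model of the register value after `movw` followed by `movt`. */
uint32_t movw_movt(uint32_t addr)
{
    /* movw writes the low half and zeroes the high half. */
    uint32_t reg = lower16(addr);
    /* movt replaces the high half and leaves the low half untouched. */
    reg = (reg & 0xFFFFu) | ((uint32_t)upper16(addr) << 16);
    assert(reg == addr);
    return reg;
}
```

For a made-up GOT slot at `0x00012345`, `movw` would carry `0x2345` and `movt` would carry `0x0001`; the subsequent `ldr pc, [ip]` then reads the jump target from that slot.
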

docs/static-linking.md

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
# Static Linking

## Build statically linked shecc and programs

Build the statically linked version of shecc:

```shell
$ make ARCH=<target arch>
```

Next, use shecc to generate statically linked programs. The following example uses shecc targeting the Arm architecture:

```shell
# Use the stage 0 compiler
$ out/shecc -o <output> <input.c>
# Use the stage 1 or stage 2 compiler
$ qemu-arm out/shecc-stage2.elf -o <output> <input.c>

# Execute the compiled program
$ qemu-arm <output>
```

## Stack frame layout

In static linking mode, the stack frame layout for each function can be illustrated as follows:

```
High Address
+------------------+
| incoming args    |
+------------------+ <- sp + total_size
| saved lr         |
+------------------+ <- sp + total_size - 4
| saved r11        |
+------------------+
| saved r10        |
+------------------+
| saved r9         |
+------------------+
| saved r8         |
+------------------+
| saved r7         |
+------------------+
| saved r6         |
+------------------+
| saved r5         |
+------------------+
| saved r4         |
+------------------+
| local variables  |
+------------------+ <- sp + (MAX_PARAMS - MAX_ARGS_IN_REG) * 4
| outgoing args    |
+------------------+ <- sp (MUST be aligned to 8 bytes)
Low Address
```

* `total_size`: includes the size of the following elements:
  * `outgoing args`: a fixed size - `(MAX_PARAMS - MAX_ARGS_IN_REG) * 4` bytes
  * All local variables
  * `saved r4-r11 and lr`: a fixed size - 36 bytes

## About function arguments handling

In the current implementation, the maximum number of function arguments that shecc can handle is 8.

### Arm (32-bit)

In the Procedure Call Standard for the Arm Architecture (AAPCS), if a function takes more than four arguments, only the first four are passed in `r0` - `r3`, and the remaining arguments are pushed onto the stack. Additionally, the stack must be properly aligned.

However, shecc places all arguments in registers `r0` - `r7`, even when there are more than four. Since all functions are compiled by shecc in static linking mode, execution still succeeds because callees retrieve their arguments from `r0` - `r7`, even though this does not comply with the AAPCS.

### RISC-V (32-bit)

In the RISC-V architecture, up to 8 arguments can be passed in registers, so shecc places all arguments directly in `a0` - `a7`. The compiled programs therefore comply with the RISC-V calling convention as long as the number of arguments does not exceed 8.

If shecc is to support more arguments in the future, it must be extended to generate instructions that push the extra arguments onto the stack properly.

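For a concrete picture of the limit, the sketch below defines a function with exactly eight `int` parameters. It is ordinary C written for a hosted compiler and has not been checked against the subset that shecc accepts; `sum8` and `sum8_demo` are illustrative names, and the register mapping in the comments restates the text above rather than output inspected from shecc.

```c
#include <assert.h>

/* Eight parameters: the current upper limit described above.
 * Per the text, static linking mode would pass a..h in r0-r7 on Arm
 * and in a0-a7 on RISC-V. */
int sum8(int a, int b, int c, int d, int e, int f, int g, int h)
{
    return a + b + c + d + e + f + g + h;
}

/* Calls sum8 with the values 1 through 8 and checks the total. */
int sum8_demo(void)
{
    int total = sum8(1, 2, 3, 4, 5, 6, 7, 8);
    assert(total == 36);
    return total;
}
```

A ninth parameter would exceed the limit, which is the case the last paragraph above says would require pushing extra arguments onto the stack.
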
## Runtime execution flow

1. The program starts at the ELF entry point.
2. The *main wrapper* runs first: it sets up a global stack for all global variables (excluding read-only variables) and initializes them.
3. Once this setup completes, the *main wrapper* retrieves `argc` and `argv` from the stack, places them in the appropriate registers, and calls the `main` function.
4. After the `main` function returns, the `_exit` system call is used to terminate the program.
