In the last article, Function Call with register EBP and ESP in x86, we covered function calls on x86. On x64, function calls differ from x86 in more ways than just widening everything to 64 bits. In this article, I'm going to show the layout of the stack frame (activation record) on x64.
What happened when it goes to 64 bit?
First of all, let's recall the example from the last article, Function Call with register EBP and ESP in x86. There we used the -m32 option to build the program in 32-bit mode; this time we want to see what happens in 64-bit mode.
/* test.c */
#include <stdio.h>

int foo1(int a, int b) {
    return a + b;
}

int foo2(int a, int b) {
    int g = a + b;
    return g;
}

int main() {
    foo1(1, 2);
    foo2(1, 2);
    return 0;
}
Let's compile this program with GCC:
gcc -g -o test test.c
Now we use GDB to disassemble the target program (on x86_64 Linux):
(gdb) disassemble foo1
Dump of assembler code for function foo1:
0x00000000004004d6 <+0>: push %rbp
0x00000000004004d7 <+1>: mov %rsp,%rbp
0x00000000004004da <+4>: mov %edi,-0x4(%rbp)
0x00000000004004dd <+7>: mov %esi,-0x8(%rbp)
0x00000000004004e0 <+10>: mov -0x4(%rbp),%edx
0x00000000004004e3 <+13>: mov -0x8(%rbp),%eax
0x00000000004004e6 <+16>: add %edx,%eax
0x00000000004004e8 <+18>: pop %rbp
0x00000000004004e9 <+19>: retq
End of assembler dump.
(gdb) disassemble foo2
Dump of assembler code for function foo2:
0x00000000004004ea <+0>: push %rbp
0x00000000004004eb <+1>: mov %rsp,%rbp
0x00000000004004ee <+4>: mov %edi,-0x14(%rbp)
0x00000000004004f1 <+7>: mov %esi,-0x18(%rbp)
0x00000000004004f4 <+10>: mov -0x14(%rbp),%edx
0x00000000004004f7 <+13>: mov -0x18(%rbp),%eax
0x00000000004004fa <+16>: add %edx,%eax
0x00000000004004fc <+18>: mov %eax,-0x4(%rbp)
0x00000000004004ff <+21>: mov -0x4(%rbp),%eax
0x0000000000400502 <+24>: pop %rbp
0x0000000000400503 <+25>: retq
End of assembler dump.
As the disassembly above shows, foo2 never subtracts from RSP to make room for locals. That's weird, because foo2 has a local variable g. If the stack pointer RSP always points to the top of the stack, where is g stored? Let's debug it in GDB:
(gdb) x/10xw $rsp
0x7fffffffe0b0: 0xffffe0c0 0x00007fff 0x00400526 0x00000000
0x7fffffffe0c0: 0x00400530 0x00000000 0xf7a2e830 0x00007fff
0x7fffffffe0d0: 0x00000000 0x00000000
(gdb) x/10xw $rbp
0x7fffffffe0b0: 0xffffe0c0 0x00007fff 0x00400526 0x00000000
0x7fffffffe0c0: 0x00400530 0x00000000 0xf7a2e830 0x00007fff
0x7fffffffe0d0: 0x00000000 0x00000000
(gdb) x/10xw $rsp-4
0x7fffffffe0ac: 0x00000003 0xffffe0c0 0x00007fff 0x00400526
0x7fffffffe0bc: 0x00000000 0x00400530 0x00000000 0xf7a2e830
0x7fffffffe0cc: 0x00007fff 0x00000000
Just before foo2 returns, we can see that RSP equals RBP, while the local variable g (value 3) sits at RSP - 4, beyond the stack pointer. What happened here? Isn't RSP supposed to point to the top of the stack? Now, let's get to the main part of this article.
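The dump above can be decoded by hand. x/10xw prints 32-bit words, and x86_64 is little-endian, so each 64-bit stack slot shows up as its low word followed by its high word. The sketch below glues two adjacent words back together. The helper names join_words, saved_rbp and return_address are made up for illustration and are not part of GDB or the test program. It also assumes the usual push %rbp / mov %rsp,%rbp prologue, so the slot at RBP holds the caller's saved RBP and the slot at RBP + 8 holds the return address.

```c
#include <stdint.h>

/* Hypothetical helper: rebuild one 64-bit stack slot from the two 32-bit
 * words that `x/xw` prints for it (low word first on little-endian x86_64). */
static uint64_t join_words(uint32_t low, uint32_t high) {
    return ((uint64_t)high << 32) | (uint64_t)low;
}

/* Words copied from the dump at 0x7fffffffe0b0, where RSP == RBP.
 * Assumption: standard frame-pointer prologue, so slot 0 is the saved RBP
 * and slot 1 is the return address. */
static uint64_t saved_rbp(void)      { return join_words(0xffffe0c0u, 0x00007fffu); }
static uint64_t return_address(void) { return join_words(0x00400526u, 0x00000000u); }
```

With these inputs, saved_rbp() gives 0x00007fffffffe0c0 and return_address() gives 0x400526. The single word 0x00000003 at RSP - 4 is g = 1 + 2.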
Register Extension in x64
x86 has 8 GPRs (General-Purpose Registers): EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI. x64 extends them to 64 bits (RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI) and adds another 8 GPRs: R8, R9, R10, R11, R12, R13, R14, R15. Remember, although they're called 'general-purpose registers', some of them are used for specific operations by default. The register extension in x64 answers the criticism that the x86 architecture has too few GPRs. More GPRs give the compiler more room to keep values in registers, which can help pipeline performance.
Activation Record Layout in x64
As mentioned above, x64 has more than 8 general-purpose registers, and some of them are used to pass parameters. According to the System V x86_64 ABI, the first 6 integer or pointer arguments of a function are passed in registers: RDI, RSI, RDX, RCX, R8 and R9, in that order. Only the 7th argument and onwards are passed on the stack. Thus, in the example above, foo1's arguments a and b arrive in EDI and ESI respectively instead of on the stack.
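The rule can be written down as a small lookup. This is only a sketch: int_arg_location is a hypothetical helper, it covers integer and pointer arguments only, and it ignores floating-point and aggregate arguments, which the ABI classifies differently.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: where the System V x86_64 ABI
 * places the n-th (0-based) integer or pointer argument. Floating-point
 * and aggregate arguments follow different rules and are ignored here. */
static const char *int_arg_location(size_t index) {
    static const char *const regs[] = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
    const size_t nregs = sizeof regs / sizeof regs[0];
    return index < nregs ? regs[index] : "stack";
}
```

For foo1(1, 2), a is index 0 (rdi, seen as %edi in the listing) and b is index 1 (rsi, seen as %esi). Index 6 and beyond return "stack".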
But how can a local variable live beyond the stack pointer? On x64, the System V ABI reserves a space called the "red zone" that a function can use to store local variables without changing the stack pointer RSP. That clears up the confusion from the beginning of this article. Now, let's look at how the System V x86_64 ABI defines this "red zone":
The 128-byte area beyond the location pointed to by %rsp is considered to be reserved and shall not be modified by signal or interrupt handlers. Therefore, functions may use this area for temporary data that is not needed across function calls. In particular, leaf functions may use this area for their entire stack frame, rather than adjusting the stack pointer in the prologue and epilogue. This area is known as the red zone.
The "red zone" is a 128-byte area below RSP (at lower addresses) that signal and interrupt handlers will not modify. Thus foo2's local variable g can be stored in foo2's "red zone" without altering RSP. In addition, leaf functions, those that do not call any other functions in their body, can use the "red zone" as their entire stack frame.
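To make the address range concrete, here is a minimal sketch that checks whether an address falls inside the red zone for a given RSP. The helper in_red_zone and the constant RED_ZONE_BYTES are names invented for this example; only the 128-byte size comes from the ABI text quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define RED_ZONE_BYTES 128u  /* size given by the System V x86_64 ABI */

/* Hypothetical helper: true if addr lies in [rsp - 128, rsp), i.e. in the
 * red zone just below the current stack pointer. */
static bool in_red_zone(uint64_t rsp, uint64_t addr) {
    return addr < rsp && rsp - addr <= RED_ZONE_BYTES;
}
```

Plugging in the values from the GDB session earlier, RSP = 0x7fffffffe0b0 and g at 0x7fffffffe0ac, the check returns true: g is 4 bytes below RSP, well inside the 128-byte zone. GCC's -mno-red-zone option tells the compiler not to use this area.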
Now, let's look at a code example based on the previous one:
/* test.c */
#include <stdio.h>

int foo2(int, int);

int foo1(int a, int b, int c, int d, int e, int f, int g, int h) {
    int sum = a + b + c + d + e + f + g + h;
    int ave = sum / 8;
    int count = foo2(sum, ave);
    return count;
}

int foo2(int a, int b) {
    int g = a / b;
    return g;
}

int main() {
    foo1(1, 2, 3, 4, 5, 6, 7, 8);
    return 0;
}
Now we use GDB to disassemble each of the target functions (on x86_64 Linux):
main
(gdb) disassemble /m main
Dump of assembler code for function main:
18 int main() {
0x0000000000400583 <+0>: push %rbp
0x0000000000400584 <+1>: mov %rsp,%rbp
19 foo1(1, 2, 3, 4, 5, 6, 7, 8);
0x0000000000400587 <+4>: pushq $0x8
0x0000000000400589 <+6>: pushq $0x7
0x000000000040058b <+8>: mov $0x6,%r9d
0x0000000000400591 <+14>: mov $0x5,%r8d
0x0000000000400597 <+20>: mov $0x4,%ecx
0x000000000040059c <+25>: mov $0x3,%edx
0x00000000004005a1 <+30>: mov $0x2,%esi
0x00000000004005a6 <+35>: mov $0x1,%edi
0x00000000004005ab <+40>: callq 0x4004d6
0x00000000004005b0 <+45>: add $0x10,%rsp
20 return 0;
0x00000000004005b4 <+49>: mov $0x0,%eax
21 }
0x00000000004005b9 <+54>: leaveq
0x00000000004005ba <+55>: retq
End of assembler dump.
foo1
(gdb) disassemble /m foo1
Dump of assembler code for function foo1:
6 int foo1(int a, int b, int c, int d, int e, int f, int g, int h) {
0x00000000004004d6 <+0>: push %rbp
0x00000000004004d7 <+1>: mov %rsp,%rbp
0x00000000004004da <+4>: sub $0x30,%rsp
0x00000000004004de <+8>: mov %edi,-0x14(%rbp)
0x00000000004004e1 <+11>: mov %esi,-0x18(%rbp)
0x00000000004004e4 <+14>: mov %edx,-0x1c(%rbp)
0x00000000004004e7 <+17>: mov %ecx,-0x20(%rbp)
0x00000000004004ea <+20>: mov %r8d,-0x24(%rbp)
0x00000000004004ee <+24>: mov %r9d,-0x28(%rbp)
7 int sum = a + b + c + d + e + f + g + h;
0x00000000004004f2 <+28>: mov -0x14(%rbp),%edx
0x00000000004004f5 <+31>: mov -0x18(%rbp),%eax
0x00000000004004f8 <+34>: add %eax,%edx
0x00000000004004fa <+36>: mov -0x1c(%rbp),%eax
0x00000000004004fd <+39>: add %eax,%edx
0x00000000004004ff <+41>: mov -0x20(%rbp),%eax
0x0000000000400502 <+44>: add %eax,%edx
0x0000000000400504 <+46>: mov -0x24(%rbp),%eax
0x0000000000400507 <+49>: add %eax,%edx
0x0000000000400509 <+51>: mov -0x28(%rbp),%eax
0x000000000040050c <+54>: add %eax,%edx
0x000000000040050e <+56>: mov 0x10(%rbp),%eax
0x0000000000400511 <+59>: add %eax,%edx
0x0000000000400513 <+61>: mov 0x18(%rbp),%eax
0x0000000000400516 <+64>: add %edx,%eax
0x0000000000400518 <+66>: mov %eax,-0xc(%rbp)
8 int ave = sum / 8;
0x000000000040051b <+69>: mov -0xc(%rbp),%eax
0x000000000040051e <+72>: lea 0x7(%rax),%edx
0x0000000000400521 <+75>: test %eax,%eax
0x0000000000400523 <+77>: cmovs %edx,%eax
0x0000000000400526 <+80>: sar $0x3,%eax
0x0000000000400529 <+83>: mov %eax,-0x8(%rbp)
9 int count = foo2(sum, ave);
0x000000000040052c <+86>: mov -0x8(%rbp),%edx
0x000000000040052f <+89>: mov -0xc(%rbp),%eax
0x0000000000400532 <+92>: mov %edx,%esi
0x0000000000400534 <+94>: mov %eax,%edi
0x0000000000400536 <+96>: callq 0x400543
0x000000000040053b <+101>: mov %eax,-0x4(%rbp)
10 return count;
0x000000000040053e <+104>: mov -0x4(%rbp),%eax
11 }
0x0000000000400541 <+107>: leaveq
0x0000000000400542 <+108>: retq
End of assembler dump.
As the assembly above shows, arguments a, b, c, d, e, f are passed in registers, while g and h are pushed onto the stack by main and read by foo1 at 0x10(%rbp) and 0x18(%rbp). Here, foo1 is not a leaf function, so GCC did not use the "red zone" for it and instead reserved stack space with sub $0x30,%rsp.
Here's the stack frame of main calling foo1
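The offsets 0x10(%rbp) and 0x18(%rbp) that foo1 uses for g and h follow from the frame layout. The sketch below computes them under two assumptions: the callee starts with push %rbp / mov %rsp,%rbp, and every stack argument occupies one 8-byte slot (true for the int arguments here, not for larger aggregates). stack_arg_rbp_offset is a hypothetical helper written for this article.

```c
#include <stddef.h>

/* Assumption: the callee starts with `push %rbp; mov %rsp,%rbp`, so
 * 0(%rbp) holds the saved RBP and 8(%rbp) holds the return address.
 * Hypothetical helper: offset from RBP of the n-th (0-based) integer
 * argument when it is passed on the stack, or -1 if it is in a register. */
static long stack_arg_rbp_offset(size_t index) {
    const size_t reg_args = 6;   /* rdi, rsi, rdx, rcx, r8, r9 */
    const long first_slot = 16;  /* skip saved RBP and return address */
    if (index < reg_args)
        return -1;
    return first_slot + 8 * (long)(index - reg_args);
}
```

For foo1, g is index 6 and h is index 7, which gives 16 (0x10) and 24 (0x18), matching the listing. main pushes 8 first and 7 second, so the 7th argument ends up at the lower address, right above the return address.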
foo2
(gdb) disassemble /m foo2
Dump of assembler code for function foo2:
13 int foo2(int a, int b) {
0x0000000000400543 <+0>: push %rbp
0x0000000000400544 <+1>: mov %rsp,%rbp
0x0000000000400547 <+4>: mov %edi,-0x14(%rbp)
0x000000000040054a <+7>: mov %esi,-0x18(%rbp)
14 int g = a / b;
0x000000000040054d <+10>: mov -0x14(%rbp),%eax
0x0000000000400550 <+13>: cltd
0x0000000000400551 <+14>: idivl -0x18(%rbp)
0x0000000000400554 <+17>: mov %eax,-0x4(%rbp)
15 return g;
0x0000000000400557 <+20>: mov -0x4(%rbp),%eax
16 }
0x000000000040055a <+23>: pop %rbp
0x000000000040055b <+24>: retq
End of assembler dump.
This brings us back to the beginning of this article: RSP is not adjusted in foo2 even though foo2 has a local variable, because GCC chose to use the "red zone" to store the local variable g. Here's the stack frame of foo2:
Conclusion
Function calls on x64 differ a lot from those on x86. The "red zone" is an optimization that lets leaf functions avoid adjusting the stack pointer in the prologue and epilogue. x64 has some interesting quirks, which aren't obvious even if you're familiar with 32-bit x86 :)