Last Updated on March 16, 2026 by Vivekanand

Understanding stack frame prologue and epilogue patterns is essential for anyone working with low-level code, debugging, or reverse engineering. In Part 2 of this series, we explored calling conventions: the contracts that govern how functions pass arguments and preserve registers. Now we’ll dive into stack frames themselves.
Every function call creates a stack frame — a dedicated workspace on the stack that holds local variables, saved registers, and the breadcrumb trail back to the caller. The function prologue sets up this frame at the start of a function, while the epilogue tears it down before returning. These are the bookends of every function call.
This comprehensive guide covers stack frame prologue and epilogue patterns across:
| Architecture | Platforms | ABI |
|---|---|---|
| x86-64 | Linux, macOS | System V AMD64 |
| x86-64 | Windows | Microsoft x64 |
| ARM64 | Linux, macOS, Windows | AAPCS64 |
What Is a Stack Frame?
A stack frame (also called an activation record) is a contiguous block of memory on the stack allocated for a single function invocation. It contains:
- Return address — Where to resume execution after the function returns
- Saved frame pointer — The caller’s frame pointer (if frame pointers are used)
- Saved registers — Callee-saved registers that this function modifies
- Local variables — Function-local data
- Spill slots — Temporary storage for register values
- Outgoing arguments — Arguments for functions this function calls (stack-passed)
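One way to see that each invocation gets its own frame is to compare frame addresses across a call. The sketch below assumes GCC or Clang (for `__builtin_frame_address` and the `noinline` attribute) and a downward-growing stack, as on x86-64 and ARM64. The function names are illustrative and not part of any ABI.

```c
#include <stdint.h>

/* Hypothetical helper: report where this invocation's frame lives. */
__attribute__((noinline))
uintptr_t inner_frame_address(void) {
    return (uintptr_t)__builtin_frame_address(0);
}

/* Returns 1 when the callee's frame sits at a lower address than ours,
   which is what we expect when the stack grows downward. */
__attribute__((noinline))
int callee_frame_is_lower(void) {
    uintptr_t outer = (uintptr_t)__builtin_frame_address(0);
    uintptr_t inner = inner_frame_address();
    return inner < outer;
}
```

Calling `callee_frame_is_lower()` from `main` should report that the inner frame was carved out below the outer one, since each call pushes a new frame toward lower addresses.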
x86-64 Stack Frame Prologue and Epilogue (System V ABI)
The System V AMD64 ABI is used on Linux, macOS, FreeBSD, and most Unix-like systems. Here’s how a typical stack frame is constructed:
Standard Prologue
push rbp ; Save caller's frame pointer
mov rbp, rsp ; Set up new frame pointer
sub rsp, N ; Allocate N bytes for locals
This is the classic frame pointer-based prologue. After execution:
- RBP points to the saved frame pointer
- [RBP+8] contains the return address
- [RBP-8] and below hold local variables
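The layout around RBP can be checked from C. This is a minimal sketch that assumes x86-64 with GCC or Clang, where `__builtin_frame_address(0)` yields the frame pointer; other targets or hardened builds may lay the frame out differently. The function name is made up for illustration.

```c
/* Returns 1 if the slot just above the saved frame pointer ([RBP+8])
   holds the same value the compiler reports as our return address.
   Assumption: x86-64, GCC or Clang, frame-pointer-based frame. */
__attribute__((noinline))
int return_address_is_above_saved_rbp(void) {
    void **frame = (void **)__builtin_frame_address(0); /* value of RBP */
    void *from_frame = frame[1];                        /* [RBP+8] */
    void *from_builtin = __builtin_return_address(0);
    return from_frame == from_builtin;
}
```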
Standard Epilogue
mov rsp, rbp ; Deallocate locals
pop rbp ; Restore caller's frame pointer
ret ; Return to caller
Or using the leave instruction:
leave ; Equivalent to: mov rsp, rbp; pop rbp
ret
Saving Callee-Saved Registers
If a function modifies callee-saved registers (RBX, RBP, R12-R15), it must restore their original values before returning:
; Prologue
push rbp
mov rbp, rsp
push rbx ; Save callee-saved registers
push r12
push r13
sub rsp, 40 ; Allocate locals; 8 (return address) + 32 (four pushes) + 40 = 80 keeps RSP 16-byte aligned
; ... function body ...
; Epilogue
add rsp, 40
pop r13
pop r12
pop rbx
pop rbp
ret
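The size passed to `sub rsp` has to account for the return address and every register pushed before it. The helper below is a sketch of that arithmetic, not something a compiler exposes; `sysv_frame_alloc` and its parameters are hypothetical names.

```c
#include <stddef.h>

/* Bytes to subtract from RSP so that RSP ends up 16-byte aligned.
   pushes: number of 8-byte registers pushed in the prologue (including RBP).
   locals: bytes of local storage the function needs. */
size_t sysv_frame_alloc(size_t pushes, size_t locals) {
    /* At entry, CALL has just pushed the 8-byte return address
       onto a stack that was 16-byte aligned. */
    size_t used = 8 + 8 * pushes;
    size_t total = (used + locals + 15) & ~(size_t)15; /* round up to 16 */
    return total - used;
}
```

With RBP, RBX, R12, and R13 pushed and 32 bytes of locals requested, `sysv_frame_alloc(4, 32)` yields 40: the extra 8 bytes are padding that restores 16-byte alignment.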
Red Zone Optimization
The System V ABI provides a 128-byte red zone below RSP that leaf functions can use without adjusting the stack pointer:
; Leaf function using red zone - no prologue needed!
mov [rsp-8], rdi ; Store in red zone
mov [rsp-16], rsi
; ... compute ...
ret
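Important: The red zone is only usable by leaf functions, because any call pushes a return address into it. In user space, signal handlers leave it intact: the kernel skips over it when building the signal frame. Kernel code cannot rely on it, since interrupts run on the same stack and would clobber it, which is why kernels are typically built with -mno-red-zone.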
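To make the extent of the red zone concrete, the following sketch expresses it as a range check. The function and macro names are illustrative assumptions; the only fact taken from the ABI is the 128-byte size below RSP.

```c
#include <stdbool.h>
#include <stdint.h>

#define SYSV_RED_ZONE_BYTES 128u

/* True if addr falls inside the red zone of a stack whose pointer is rsp,
   i.e. in the half-open range [rsp - 128, rsp). */
bool in_red_zone(uintptr_t rsp, uintptr_t addr) {
    return addr < rsp && rsp - addr <= SYSV_RED_ZONE_BYTES;
}
```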
x86-64 Stack Frame Prologue and Epilogue (Windows x64)
Windows uses a different ABI with several unique requirements:
Shadow Space Requirement
Every function that calls another function must reserve 32 bytes of shadow space (also called home space) for the callee’s first four register arguments, even if the callee takes fewer. The caller allocates this space in its own frame before executing the call instruction, so it sits directly above the callee’s return address. The callee may use it to spill its incoming register arguments (RCX, RDX, R8, R9), for example for debugging or varargs. In the prologue below, the 32 bytes are reserved for the functions this function calls, not for the function itself.
; Windows x64 Prologue
push rbp
mov rbp, rsp
sub rsp, 48 ; 32 bytes shadow + 16 bytes locals (aligned)
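From the callee's point of view, the shadow space and any stack-passed arguments form one contiguous array of 8-byte slots just above the return address. The sketch below captures that offset arithmetic; `win64_arg_slot_offset` is a hypothetical name used only for illustration.

```c
#include <stddef.h>

/* Offset from RSP, at callee entry, of the stack slot for argument `index`
   (0-based) under the Microsoft x64 convention. Slots 0-3 are the shadow
   (home) space for RCX, RDX, R8, R9; slot 4 onward holds stack-passed
   arguments. */
size_t win64_arg_slot_offset(size_t index) {
    return 8 + 8 * index; /* skip the 8-byte return address at [rsp] */
}
```

So the home slot for RCX is at [rsp+8], the one for R9 is at [rsp+32], and the fifth argument is found at [rsp+40].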
Callee-Saved Registers
Windows has more callee-saved registers: RBX, RBP, RDI, RSI, R12-R15, and XMM6-XMM15:
; Saving XMM registers on Windows
sub rsp, 32
movaps [rsp], xmm6
movaps [rsp+16], xmm7
No Red Zone
Windows does not have a red zone. You must always adjust RSP before storing anything below it.
ARM64 Stack Frame Prologue and Epilogue (AAPCS64)
ARM64 follows AAPCS64 (the Procedure Call Standard for the Arm 64-bit Architecture). The prologue and epilogue patterns differ significantly from x86-64:
Standard Prologue
stp x29, x30, [sp, #-16]! ; Save FP and LR, pre-decrement SP
mov x29, sp ; Set up frame pointer
sub sp, sp, #N ; Allocate N bytes for locals
Key differences from x86-64:
- X29 is the frame pointer (equivalent to RBP)
- X30 (LR) holds the return address (not pushed by call)
- STP stores a pair of registers efficiently
- Pre-indexed addressing [sp, #-16]! combines store and decrement
Standard Epilogue
add sp, sp, #N ; Deallocate locals
ldp x29, x30, [sp], #16 ; Restore FP and LR, post-increment SP
ret ; Return (uses X30)
Saving Callee-Saved Registers
ARM64 callee-saved registers are X19-X28 and D8-D15 (the low 64 bits of SIMD registers V8-V15). They’re typically saved in pairs:
; Prologue with callee-saved registers
stp x29, x30, [sp, #-48]! ; Save FP, LR
mov x29, sp
stp x19, x20, [sp, #16] ; Save callee-saved pairs
stp x21, x22, [sp, #32]
; ... function body ...
; Epilogue
ldp x21, x22, [sp, #32]
ldp x19, x20, [sp, #16]
ldp x29, x30, [sp], #48
ret
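Because SP must stay 16-byte aligned, the save area is always sized in whole pairs. A small sketch of that rounding, with a hypothetical helper name:

```c
#include <stddef.h>

/* Bytes needed to save `nregs` 8-byte registers while keeping SP
   16-byte aligned: an odd register count still costs a full pair. */
size_t aarch64_save_area_bytes(size_t nregs) {
    return (nregs * 8 + 15) & ~(size_t)15;
}
```

The six registers saved above (X29, X30, X19-X22) need exactly 48 bytes, matching the `#-48` in the first `stp`. Saving only three registers would still consume 32 bytes.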
Apple ARM64 Variations
Apple’s ARM64 ABI has some specific requirements:
- Pointer Authentication (PAC) — Return addresses may be signed
- BTI — Branch Target Identification for control flow integrity
; Apple ARM64 with PAC
pacibsp ; Sign return address
stp x29, x30, [sp, #-16]!
mov x29, sp
; ... function body ...
ldp x29, x30, [sp], #16
retab ; Authenticate and return
Frame Pointer Omission (FPO)
With optimizations enabled, compilers often omit the frame pointer to free up a register:
x86-64 Without Frame Pointer
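; No frame pointer - RBP available for general use
sub rsp, 40 ; Allocate stack space (8 for the return address + 40 keeps RSP 16-byte aligned)
mov [rsp+8], rbx ; Save callee-saved if needed
mov [rsp+16], rbp ; RBP is still callee-saved, so save it before reusing it
; ... use RBP as general register ...
mov rbp, [rsp+16]
mov rbx, [rsp+8]
add rsp, 40
ret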
This makes debugging harder but provides an extra register. With GCC or Clang, pass -fno-omit-frame-pointer to keep frame pointers.
ARM64 Without Frame Pointer
; Leaf function - no frame record needed (no calls, so X30 stays intact)
stp x19, x20, [sp, #-16]! ; Save if using these registers
; ... function body; X29 is callee-saved, so save it as well before using it as scratch ...
ldp x19, x20, [sp], #16
ret
Stack Alignment Requirements
Proper stack alignment is mandatory for correct execution:
| Platform | Alignment | When Required |
|---|---|---|
| x86-64 System V | 16-byte | Before CALL instruction |
| x86-64 Windows | 16-byte | Before CALL instruction |
| ARM64 | 16-byte | Always (SP must be aligned) |
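On ARM64, using a misaligned SP as a base address raises an alignment fault when SP alignment checking is enabled, which is the usual configuration. On x86-64, misalignment causes performance penalties, or crashes when aligned SSE instructions such as movaps access the stack.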
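Alignment can be observed from C by inspecting the frame pointer. This sketch assumes GCC or Clang targeting System V x86-64 or ARM64, where the frame pointer is established on a 16-byte boundary; the function name is illustrative.

```c
#include <stdint.h>

/* Returns 1 if this function's frame pointer is 16-byte aligned.
   Assumption: the caller followed the ABI's alignment rule. */
__attribute__((noinline))
int frame_pointer_is_16_byte_aligned(void) {
    uintptr_t fp = (uintptr_t)__builtin_frame_address(0);
    return (fp & 15u) == 0;
}
```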
Practical Debugging: Walking the Stack
Understanding stack frame prologue and epilogue patterns is essential for debugging. Here’s how to walk a stack manually, assuming the code was built with frame pointers:
x86-64 Stack Walk
#include <stdio.h>

// Walk the stack using frame pointers (the chain must end in a NULL saved RBP)
void walk_stack(void) {
    void **frame = (void **)__builtin_frame_address(0);
    while (frame) {
        void *return_addr = frame[1]; // [RBP+8]
        printf("Return address: %p\n", return_addr);
        frame = (void **)frame[0]; // Follow saved RBP
    }
}
ARM64 Stack Walk
#include <stdio.h>

// Walk the stack using frame pointers (the chain must end in a NULL saved X29)
void walk_stack(void) {
    void **frame = (void **)__builtin_frame_address(0);
    while (frame) {
        void *return_addr = frame[1]; // Saved X30 (LR)
        printf("Return address: %p\n", return_addr);
        frame = (void **)frame[0]; // Follow saved X29 (FP)
    }
}
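The loops above trust every saved frame pointer they read. A more defensive variant, sketched below, caps the depth and stops if the chain does not move toward higher addresses. The struct and function names are assumptions made for this example, and the synthetic chain exists only so the walker can be exercised without touching the real stack.

```c
#include <stddef.h>
#include <stdint.h>

/* Frame record as laid out by the prologues above:
   saved frame pointer first, return address right after it. */
struct frame_record {
    struct frame_record *caller_fp; /* saved RBP or X29 */
    void *return_addr;              /* [RBP+8] or saved X30 */
};

/* Collects up to `max` return addresses into `out`.
   Stops on a NULL link or when the chain stops increasing. */
size_t walk_frames(const struct frame_record *fp, void **out, size_t max) {
    size_t n = 0;
    while (fp != NULL && n < max) {
        out[n++] = fp->return_addr;
        const struct frame_record *next = fp->caller_fp;
        /* Callers live at higher addresses; anything else suggests
           a corrupt chain or code built without frame pointers. */
        if (next != NULL && (uintptr_t)next <= (uintptr_t)fp)
            break;
        fp = next;
    }
    return n;
}

/* Builds a fake three-frame chain and walks it. */
size_t walk_synthetic_chain(void) {
    struct frame_record frames[3] = {
        { &frames[1], (void *)0x1000 },
        { &frames[2], (void *)0x2000 },
        { NULL,       (void *)0x3000 },
    };
    void *addrs[8];
    return walk_frames(frames, addrs, 8);
}
```

On a build with frame pointers enabled, passing `__builtin_frame_address(0)` as `fp` walks the live stack. The depth cap and ordering check reduce, but do not eliminate, the risk of following a bad pointer.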
Quick Reference Table
| Feature | x86-64 System V | x86-64 Windows | ARM64 |
|---|---|---|---|
| Frame Pointer | RBP | RBP | X29 |
| Stack Pointer | RSP | RSP | SP |
| Return Address | [RBP+8] (pushed by CALL) | [RBP+8] | X30/LR (saved manually) |
| Red Zone | 128 bytes | None | None in AAPCS64 (Apple platforms reserve 128 bytes) |
| Shadow Space | Not required | 32 bytes (caller allocates) | Not required |
| Alignment | 16-byte before CALL | 16-byte before CALL | 16-byte always |
What’s Next
Now that you understand stack frame prologue and epilogue patterns, you can:
- Debug more effectively — Read crash dumps and walk stacks manually
- Write correct assembly — Ensure proper register preservation and alignment
- Reverse engineer binaries — Identify function boundaries and local variables
- Understand compiler output — See how optimizations affect the generated code
In Part 4, we’ll explore program startup: what happens before main() runs, how the C runtime initializes, and where command-line arguments come from.
Experiment: Use Godbolt to compile the same function with different optimization levels (-O0, -O1, -O2, -O3) and watch how the stack frame prologue and epilogue evolve.

