x86-64 vs ARM64 Instruction Map: Side-by-Side Assembly Reference
This x86-64 vs ARM64 instruction map shows the most common assembly instructions on both dominant 64-bit architectures side by side. Use it as a translation table when reading disassembly, porting code, or studying compiler output across platforms. Each row shows the equivalent operation with key architectural differences noted.
For hands-on tutorials, see my Assembly Syscall Deep Dive (Linux/macOS, both architectures) and Calling Conventions Demystified.
1. Data Movement
The biggest difference in data movement is that ARM64 is a load-store architecture — arithmetic instructions cannot access memory directly. x86-64 allows memory operands in most instructions, enabling direct manipulation of data at memory addresses.
| Operation | x86-64 | ARM64 | Notes |
|---|---|---|---|
| Register → Register | mov rax, rbx | mov x0, x1 | |
| Immediate → Register | mov rax, 42 | mov x0, #42 | ARM64: each movz/movk sets one 16-bit chunk, so a full 64-bit constant can take up to four instructions |
| Load from memory | mov rax, [rbp-8] | ldr x0, [x29, #-8] | ARM64: separate load instruction |
| Store to memory | mov [rbp-8], rax | str x0, [x29, #-8] | ARM64: separate store instruction |
| Load effective address | lea rax, [rip+label] | adr x0, label | PC-relative; adr reaches ±1 MB, adrp + add reaches ±4 GB |
| Zero-extend byte load | movzx eax, byte [rbp-1] | ldrb w0, [x29, #-1] | |
| Sign-extend byte load | movsx rax, byte [rbp-1] | ldrsb x0, [x29, #-1] | |
| Push to stack | push rax | str x0, [sp, #-16]! | ARM64: SP must stay 16-byte aligned, so stp for pairs is typical |
| Pop from stack | pop rax | ldr x0, [sp], #16 | ARM64 often uses ldp for pairs |
| Conditional move | cmovz rax, rbx | csel x0, x1, x0, eq | ARM64 csel selects between two registers |
| Load pair | — (no equivalent) | ldp x0, x1, [sp] | ARM64 can load/store two registers at once |
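
The 16-bit limit noted on the immediate row can be modelled in a few lines of Python. This is a minimal sketch, not how any particular assembler works: `movz_movk_sequence` is a hypothetical helper that uses only movz and movk, while real assemblers may also pick shorter encodings (movn, logical immediates) that this sketch ignores.

```python
def movz_movk_sequence(value, reg="x0"):
    """Return ARM64 instructions that build a 64-bit constant in `reg`.

    Hypothetical helper for illustration only: it uses movz for the first
    non-zero 16-bit chunk and movk for every later non-zero chunk.
    """
    value &= 0xFFFFFFFFFFFFFFFF
    # Split the constant into four 16-bit chunks, lowest first.
    chunks = [(shift, (value >> shift) & 0xFFFF) for shift in (0, 16, 32, 48)]
    nonzero = [(shift, chunk) for shift, chunk in chunks if chunk != 0]
    if not nonzero:
        return [f"movz {reg}, #0"]
    instructions = []
    for index, (shift, chunk) in enumerate(nonzero):
        # movz zeroes the other bits; movk keeps them.
        op = "movz" if index == 0 else "movk"
        suffix = f", lsl #{shift}" if shift else ""
        instructions.append(f"{op} {reg}, #0x{chunk:x}{suffix}")
    return instructions


for line in movz_movk_sequence(0x123456789ABCDEF0):
    print(line)
```

A small constant such as 42 needs a single movz, while the constant above needs one movz and three movk. On x86-64 the same value fits in one mov with a 64-bit immediate.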
2. Arithmetic
x86-64 typically uses a two-operand form (where the destination is also a source), while ARM64 uses a three-operand form (separate destination, source1, source2), allowing for more expressive single instructions.
| Operation | x86-64 | ARM64 | Notes |
|---|---|---|---|
| Add | add rax, rbx | add x0, x0, x1 | ARM64: 3-operand (dst, src1, src2) |
| Add with carry | adc rax, rbx | adc x0, x0, x1 | |
| Subtract | sub rax, rbx | sub x0, x0, x1 | |
| Subtract with borrow | sbb rax, rbx | sbc x0, x0, x1 | ARM64 carry is an inverted borrow: C=0 means a borrow occurred |
| Multiply (low result) | imul rax, rbx | mul x0, x0, x1 | x86-64 two-operand form; both keep only the low 64 bits |
| Multiply high | mul rbx → RDX:RAX | umulh x0, x1, x2 | x86-64: high half lands in RDX; unsigned (mul) and signed (imul). ARM64: smulh for signed |
| Multiply-accumulate | — (no single instruction) | madd x0, x1, x2, x3 | x0 = x1 × x2 + x3 |
| Divide (signed) | idiv rbx | sdiv x0, x1, x2 | x86-64: divides RDX:RAX by the operand; quotient in RAX |
| Divide (unsigned) | div rbx | udiv x0, x1, x2 | x86-64: remainder in RDX. ARM64: no remainder output; compute it with msub |
| Negate | neg rax | neg x0, x1 | |
| Increment | inc rax | add x0, x0, #1 | ARM64 has no dedicated inc/dec |
| Decrement | dec rax | sub x0, x0, #1 | |
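
The multiply-high and divide rows hide a structural difference: x86-64 returns two results from one instruction, while ARM64 uses one instruction per result. The following Python model is a sketch of the arithmetic only. The function names are made up for illustration, and registers are passed as plain integers.

```python
MASK64 = (1 << 64) - 1


def x86_mul(rax, rbx):
    """Model of x86-64 `mul rbx`: unsigned 128-bit product as (rdx, rax)."""
    product = (rax & MASK64) * (rbx & MASK64)
    return product >> 64, product & MASK64


def arm64_umulh_mul(x1, x2):
    """Model of ARM64 `umulh` followed by `mul`: one instruction per half."""
    product = (x1 & MASK64) * (x2 & MASK64)
    high = product >> 64        # umulh x0, x1, x2
    low = product & MASK64      # mul   x3, x1, x2
    return high, low


def trunc_div(a, b):
    """Signed division rounding toward zero, as both idiv and sdiv do."""
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient


def arm64_sdiv_msub(x1, x2):
    """sdiv yields only the quotient; msub recovers the remainder."""
    quotient = trunc_div(x1, x2)          # sdiv x0, x1, x2
    remainder = x1 - quotient * x2        # msub x3, x0, x2, x1
    return quotient, remainder


print(x86_mul(MASK64, 2) == arm64_umulh_mul(MASK64, 2))
print(arm64_sdiv_msub(-7, 2))
```

The pair returned by `arm64_sdiv_msub` matches what idiv leaves in RAX and RDX for the same inputs, since both architectures round the quotient toward zero.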
3. Logic & Bitwise Operations
ARM64 includes efficient bit-manipulation instructions like bic (bit clear) and clz (count leading zeros) as single operations.
| Operation | x86-64 | ARM64 | Notes |
|---|---|---|---|
| Bitwise AND | and rax, rbx | and x0, x0, x1 | |
| Bitwise OR | or rax, rbx | orr x0, x0, x1 | ARM64 mnemonic is orr |
| Bitwise XOR | xor rax, rbx | eor x0, x0, x1 | ARM64 mnemonic is eor |
| Bitwise NOT | not rax | mvn x0, x1 | ARM64: MoVe Not |
| Bit clear (AND NOT) | andn rax, rbx, rcx | bic x0, x1, x2 | Operand order differs: andn inverts its first source (rax = ~rbx AND rcx, BMI1); bic inverts its second (x0 = x1 AND ~x2) |
| Logical shift left | shl rax, cl | lsl x0, x1, x2 | |
| Logical shift right | shr rax, cl | lsr x0, x1, x2 | |
| Arithmetic shift right | sar rax, cl | asr x0, x1, x2 | Preserves sign bit |
| Rotate right | ror rax, cl | ror x0, x1, x2 | |
| Count leading zeros | lzcnt rax, rbx | clz x0, x1 | x86-64 requires LZCNT extension |
| Test bit + branch | bt rax, 5 + jc label | tbnz x0, #5, label | x86-64: two instructions; ARM64: fused |
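
The bit-clear row is easy to get wrong when porting, because andn and bic invert different operands. A minimal Python sketch of the two definitions (function names are illustrative only):

```python
MASK64 = (1 << 64) - 1


def x86_andn(src1, src2):
    """BMI1 `andn dst, src1, src2`: the FIRST source is inverted."""
    return (~src1 & src2) & MASK64


def arm64_bic(xn, xm):
    """`bic xd, xn, xm`: the SECOND source is inverted."""
    return (xn & ~xm) & MASK64


value = 0b1111
mask = 0b0101

# To clear the bits of `mask` from `value`, the mask goes first on x86-64
# and second on ARM64.
print(bin(x86_andn(mask, value)))
print(bin(arm64_bic(value, mask)))
```

Passing the operands in the same order to both functions gives different results, so a mechanical register-for-register translation of this row needs the sources swapped.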
4. Comparison & Branching
ARM64’s cbz and cbnz instructions combine compare-with-zero and branch into a single instruction that leaves the condition flags untouched, reducing instruction count in common loops and checks.
| Operation | x86-64 | ARM64 | Notes |
|---|---|---|---|
| Compare | cmp rax, rbx | cmp x0, x1 | Sets flags (SUB without storing) |
| Test | test rax, rbx | tst x0, x1 | AND without storing |
| Branch if equal | je label | b.eq label | |
| Branch if not equal | jne label | b.ne label | |
| Branch if less (signed) | jl label | b.lt label | |
| Branch if greater (signed) | jg label | b.gt label | |
| Branch if ≤ (signed) | jle label | b.le label | |
| Branch if ≥ (signed) | jge label | b.ge label | |
| Branch if below (unsigned) | jb label | b.lo label | x86-64 tests CF=1; ARM64 tests C=0 (carry is inverted after a subtraction) |
| Branch if above (unsigned) | ja label | b.hi label | |
| Unconditional jump | jmp label | b label | |
| Branch if zero | test rax, rax + je label | cbz x0, label | ARM64 fuses test+branch |
| Branch if nonzero | test rax, rax + jne label | cbnz x0, label | |
| Call function | call func | bl func | ARM64: return address in X30 (LR) |
| Return | ret | ret | x86-64 pops from stack; ARM64 uses X30 |
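
The unsigned branch rows (jb and b.lo) test opposite carry values yet make the same decision, because the two architectures define carry differently after a subtraction. The Python model below is a sketch of that flag arithmetic with illustrative function names, not a full flags emulator.

```python
MASK64 = (1 << 64) - 1


def x86_cmp_cf(a, b):
    """CF after x86-64 `cmp a, b`: set when the subtraction borrows."""
    return int((a & MASK64) < (b & MASK64))


def arm64_cmp_c(a, b):
    """C after ARM64 `cmp a, b`: set when the subtraction does NOT borrow."""
    # ARM64 computes a + NOT(b) + 1; C is the carry out of bit 63.
    total = (a & MASK64) + (~b & MASK64) + 1
    return int(total >> 64)


def jb_taken(a, b):
    """x86-64 `jb` branches when CF is 1."""
    return x86_cmp_cf(a, b) == 1


def b_lo_taken(a, b):
    """ARM64 `b.lo` branches when C is 0."""
    return arm64_cmp_c(a, b) == 0


for a, b in [(1, 2), (2, 1), (5, 5), (0, MASK64)]:
    print(a, b, jb_taken(a, b), b_lo_taken(a, b))
```

The same inversion is why sbc on ARM64 treats C=0 as "borrow", the opposite of how sbb reads CF on x86-64.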
5. System Interface
System calls use different instructions and register conventions. Both architectures also provide spin-loop hints: pause on x86-64 and yield on ARM64.
| Operation | x86-64 | ARM64 | Notes |
|---|---|---|---|
| System call | syscall | svc #0 | Linux: syscall # in RAX vs X8; macOS on ARM64 uses X16 and svc #0x80 |
| Breakpoint | int3 | brk #0 | Used by debuggers |
| No operation | nop | nop | |
| Memory barrier | mfence | dmb sy | ARM64 has finer-grained barriers |
| Spin-wait hint | pause | yield | Hint only; neither instruction halts the core |
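
The system call row condenses a whole register convention into one note. As a rough sketch for Linux only, the lookup table below spells it out. `LINUX_SYSCALL_ABI` and `describe_syscall` are hypothetical names, and the output is descriptive text rather than assemblable code.

```python
# Linux user-space syscall conventions; other operating systems differ.
LINUX_SYSCALL_ABI = {
    "x86-64": {
        "instruction": "syscall",
        "number_reg": "rax",
        "arg_regs": ["rdi", "rsi", "rdx", "r10", "r8", "r9"],
        "return_reg": "rax",
        "numbers": {"write": 1, "exit": 60},
    },
    "arm64": {
        "instruction": "svc #0",
        "number_reg": "x8",
        "arg_regs": ["x0", "x1", "x2", "x3", "x4", "x5"],
        "return_reg": "x0",
        "numbers": {"write": 64, "exit": 93},
    },
}


def describe_syscall(arch, name, args):
    """Return the register setup for one syscall as readable text lines."""
    abi = LINUX_SYSCALL_ABI[arch]
    if len(args) > len(abi["arg_regs"]):
        raise ValueError("Linux syscalls take at most six arguments")
    lines = [f"{abi['number_reg']} = {abi['numbers'][name]}"]
    lines += [f"{reg} = {arg}" for reg, arg in zip(abi["arg_regs"], args)]
    lines.append(abi["instruction"])
    return lines


for arch in ("x86-64", "arm64"):
    print(arch, describe_syscall(arch, "write", ["1", "msg", "len"]))
```

Note that the syscall numbers themselves differ between the two architectures, not just the registers that carry them.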
Key Architectural Differences
Understanding these fundamental differences helps when reading this instruction map and translating between the two architectures:
Fixed vs. Variable-Length Instructions: ARM64 uses fixed 32-bit (4-byte) instructions, making disassembly trivial — every instruction starts at a 4-byte boundary. x86-64 instructions vary from 1 to 15 bytes, requiring sequential decoding from a known starting point.
Load-Store Architecture: ARM64 arithmetic operates only on registers. To add a value from memory, you must first ldr it into a register, then add. x86-64 can embed memory operands directly: add rax, [rbp-8].
Link Register vs. Stack Return: On ARM64, bl (branch with link) stores the return address in X30 (LR), and ret branches to X30. On x86-64, call pushes the return address onto the stack, and ret pops it. This means ARM64 leaf functions don’t need to save LR or set up a stack frame, though they may still use stack space for local variables.
Condition Codes: Both architectures set flags after certain operations, but ARM64 requires explicit flag-setting variants (adds, subs) for arithmetic; plain add/sub do not set flags. On x86-64, most arithmetic and logic instructions update RFLAGS implicitly, with exceptions such as mov and lea.
PC-Relative Addressing: ARM64’s adrp/add pair can reach any address within ±4 GB of the current PC, while x86-64’s RIP-relative addressing reaches ±2 GB. Both are used for position-independent code.
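
The adrp/add split can be worked through numerically. In this Python sketch (helper names are illustrative), adrp contributes a signed 21-bit count of 4 KB pages relative to the page containing the PC, and add supplies the low 12 bits of the target address.

```python
def adrp_add_operands(pc, target):
    """Split `target` into the adrp page delta and the add low-12 offset."""
    page_delta = (target >> 12) - (pc >> 12)   # signed 21-bit immediate
    if not -(1 << 20) <= page_delta < (1 << 20):
        raise ValueError("target is outside the adrp range of about ±4 GB")
    low12 = target & 0xFFF                     # offset within the 4 KB page
    return page_delta, low12


def resolve(pc, page_delta, low12):
    """What the pair computes: adrp clears PC's low 12 bits, add fills them."""
    page = (pc & ~0xFFF) + (page_delta << 12)  # adrp x0, label
    return page + low12                        # add  x0, x0, :lo12:label


pc, target = 0x400123, 0x412ABC
delta, low = adrp_add_operands(pc, target)
print(delta, hex(low), hex(resolve(pc, delta, low)))
```

A 21-bit page count times 4 KB per page is where the ±4 GB reach comes from; the low 12 bits of the PC never affect the result.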
How to Use This Map
Use the map as a translation table when:
- Reading disassembly: Identify equivalent logic across architectures.
- Porting code: Rewrite core logic for a different platform.
- Studying compiler output: Understand how different backends lower the same high-level code.
Keep in mind that some mappings are not 1:1. x86-64’s CISC heritage means a single instruction with a memory operand (like add rax, [rbp-8]) requires two ARM64 instructions (a load and an add). Conversely, ARM64 offers fused operations like madd (multiply-accumulate) and cbz (compare-and-branch) that require multiple x86-64 instructions.
Related Resources
Deep Dives
- Assembly Hello World: Cross-Platform Syscall Tutorial
- Calling Conventions Demystified
- Stack Frames & Function Prologues