Executable File Formats: ELF, PE, and Mach-O Explained

Last Updated on February 14, 2026 by Vivekanand

In Part 3 of this series, we examined how functions organize themselves on the stack. But before a program can make its first function call, or execute even its first instruction, it exists as a static file on your disk.

How does the operating system know how to take that file and turn it into a running process? The answer lies in the Executable File Format.

In this guide (Part 4), we’ll dissect the three major executable file formats that power our world: ELF (Linux/BSD), PE (Windows), and Mach-O (macOS). Understanding these containers is the prerequisite to understanding program startup, dynamic linking, and reverse engineering.


The “Container” Analogy for Executable Files

Think of an executable file format not as a linear list of instructions, but as a complex shipping container.

  • Headers: The manifest taped to the outside (Architecture, Entry Point, Required OS).
  • Segments: The big cargo sections loaded into memory (Code vs. Data).
  • Sections: The categorized items inside the cargo (Read-only strings, Global variables, Debug info).

The OS Loader’s job is to read the manifest, allocate memory, and unpack the cargo exactly where it needs to go.

graph TD
    subgraph ELF [ELF File Layout]
        E1[ELF Header]
        E2[Program Header Table]
        E3[.text Section]
        E4[.data Section]
        E5[Section Header Table]
    end

    subgraph PE [PE File Layout]
        P1[DOS Header]
        P2[PE Header]
        P3[Section Table]
        P4[.text Section]
        P5[.data Section]
    end

    subgraph MACHO [Mach-O File Layout]
        M1[Mach Header]
        M2[Load Commands]
        M3[__TEXT Segment]
        M4[__DATA Segment]
    end

    style E1 fill:#2d3436,stroke:#74b9ff,stroke-width:2px
    style P1 fill:#2d3436,stroke:#ff7675,stroke-width:2px
    style M1 fill:#2d3436,stroke:#55efc4,stroke-width:2px

[Figure: Executable file on disk vs. memory mapping. From disk to RAM: how the OS Loader maps an executable file into virtual memory.]

The Big Three Executable File Formats

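| Format | Full Name | OS Family | Origins |
|--------|-----------|-----------|---------|
| ELF | Executable and Linkable Format | Linux, Android, BSD, Consoles (PS4/5) | Unix System V |
| PE | Portable Executable | Windows, UEFI | COFF (Unix System V), adapted by Microsoft for Windows NT |
| Mach-O | Mach Object | macOS, iOS, watchOS | NeXTSTEP / Mach Kernel |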

Although they differ in binary layout, all these executable file formats solve the exact same problems:

  1. Where does execution start? (Entry Point)
  2. What memory needs to be allocated? (Segments)
  3. What external code is needed? (Dynamic Linking)
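
As a rough illustration of how each format announces itself, the sketch below identifies a file by its leading magic bytes. The function name `detect_format` is hypothetical, and the sketch only recognizes single-architecture Mach-O files.

```python
import struct

ELF_MAGIC = b"\x7fELF"
MZ_MAGIC = b"MZ"
# Mach-O magic numbers as they appear on disk, for both byte orders.
MACHO_MAGICS = {
    b"\xce\xfa\xed\xfe",  # 0xFEEDFACE (32-bit), little-endian file
    b"\xcf\xfa\xed\xfe",  # 0xFEEDFACF (64-bit), little-endian file
    b"\xfe\xed\xfa\xce",  # 0xFEEDFACE (32-bit), big-endian file
    b"\xfe\xed\xfa\xcf",  # 0xFEEDFACF (64-bit), big-endian file
}


def detect_format(data: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    if data[:4] == ELF_MAGIC:
        return "ELF"
    if data[:4] in MACHO_MAGICS:
        return "Mach-O"
    if data[:2] == MZ_MAGIC and len(data) >= 0x40:
        # The DOS header stores the file offset of the PE signature at 0x3C.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\0\0":
            return "PE"
    return "unknown"
```

For a real file, something like `detect_format(open(path, "rb").read(4096))` should be enough in most cases, since the PE signature normally sits within the first few hundred bytes.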

1. ELF: The Universal Standard

ELF is the workhorse of the Unix world. It’s clean, flexible, and extremely well-documented.

Structure

An ELF file consists of:

  1. ELF Header: Magic bytes (0x7F 'E' 'L' 'F'), architecture info (x86-64/ARM64), and the Entry Point Address (virtual address where _start usually lives).
  2. Program Header Table (Segments): Instructions for the OS Loader. It says “Take bytes 0x1000 to 0x2000 from the file and map them to address 0x400000 with Read/Execute permissions.”
  3. Section Header Table (Sections): Metadata for the Linker and for tools such as debuggers. It defines logical groupings like .text (code) or .data (variables). The loader does not need it to run the program.
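
To make the header fields concrete, here is a minimal sketch that decodes the 64-byte ELF64 header with Python's `struct` module. The names `ElfHeader` and `parse_elf64_header` are made up for this example, and 32-bit files are deliberately rejected to keep it short.

```python
import struct
from typing import NamedTuple


class ElfHeader(NamedTuple):
    machine: str   # human-readable architecture name
    entry: int     # e_entry: virtual address of the entry point
    phoff: int     # e_phoff: file offset of the Program Header Table
    phnum: int     # e_phnum: number of program headers


# Only two e_machine values are named here; others are shown as hex.
MACHINES = {0x3E: "x86-64", 0xB7: "AArch64"}


def parse_elf64_header(data: bytes) -> ElfHeader:
    if data[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic bytes")
    if data[4] != 2:  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
        raise ValueError("this sketch only handles 64-bit ELF")
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    fields = struct.unpack_from(endian + "16sHHIQQQIHHHHHH", data, 0)
    machine, entry, phoff, phnum = fields[2], fields[4], fields[5], fields[10]
    return ElfHeader(MACHINES.get(machine, hex(machine)), entry, phoff, phnum)
```

Calling `parse_elf64_header(open("/bin/ls", "rb").read(64))` on a 64-bit Linux system should report the same entry point that `readelf -h` prints.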

Key Inspection Tools (Linux)

# View the Header (Entry Point is key!)
readelf -h /bin/ls

# View Segments (Loader View)
readelf -l /bin/ls

# View Sections (Linker View)
readelf -S /bin/ls
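
If you want to see what `readelf -l` is reading, the following sketch walks the Program Header Table and lists the loadable (PT_LOAD) segments. It assumes a little-endian ELF64 file, and `load_segments` is a hypothetical helper name, not part of any library.

```python
import struct

PT_LOAD = 1                      # segment type: map into memory
PF_X, PF_W, PF_R = 1, 2, 4       # permission bits in p_flags


def load_segments(data: bytes):
    """Yield (file_offset, vaddr, filesz, memsz, perms) per PT_LOAD entry."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (phoff,) = struct.unpack_from("<Q", data, 0x20)           # e_phoff
    phentsize, phnum = struct.unpack_from("<HH", data, 0x36)  # e_phentsize, e_phnum
    for i in range(phnum):
        (p_type, p_flags, p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, _p_align) = struct.unpack_from(
            "<IIQQQQQQ", data, phoff + i * phentsize)
        if p_type != PT_LOAD:
            continue
        perms = (("R" if p_flags & PF_R else "-")
                 + ("W" if p_flags & PF_W else "-")
                 + ("X" if p_flags & PF_X else "-"))
        yield p_offset, p_vaddr, p_filesz, p_memsz, perms
```

Each tuple is one "map these file bytes to that address with these permissions" instruction for the loader.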

2. PE: The Windows Legacy

Portable Executable (PE) format is actually a modified version of the ancient COFF (Common Object File Format).

Structure

PE headers are notoriously complex because they preserve backward compatibility all the way back to DOS.

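  1. DOS Header: Starts with MZ (the initials of Mark Zbikowski). Its field at offset 0x3C (e_lfanew) points to the PE header. Between the two sits a tiny DOS stub program that prints “This program cannot be run in DOS mode.”
  2. PE Header: Starts with the signature PE\0\0, followed by the COFF file header and the Optional Header. The Optional Header contains the “ImageBase” (preferred load address) and “AddressOfEntryPoint” (the entry point, stored as an offset relative to ImageBase).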
  3. Data Directories: A critical array pointing to Import Tables, Export Tables, and Resources (icons, menus).
  4. Section Table: Defines .text, .data, .rsrc (resources).
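
The chain from the DOS header to the entry point can be followed in a few lines. This is a sketch under the assumption of a well-formed PE32 or PE32+ image; `parse_pe` and the dictionary keys it returns are names invented for this example.

```python
import struct


def parse_pe(data: bytes) -> dict:
    """Follow DOS header -> PE signature -> COFF header -> Optional Header."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)   # e_lfanew
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header (20 bytes) follows the 4-byte signature.
    machine, num_sections = struct.unpack_from("<HH", data, pe_off + 4)
    opt = pe_off + 4 + 20                              # start of Optional Header
    (magic,) = struct.unpack_from("<H", data, opt)
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    if magic == 0x20B:      # PE32+ (64-bit): 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    elif magic == 0x10B:    # PE32 (32-bit): 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    else:
        raise ValueError("unknown Optional Header magic")
    return {
        "machine": machine,
        "sections": num_sections,
        "entry_rva": entry_rva,
        "image_base": image_base,
        # Address of the first instruction if loaded at the preferred base.
        "entry_va": image_base + entry_rva,
    }
```

The `machine` value is 0x8664 for x86-64 and 0xAA64 for ARM64 images.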

Key Inspection Tools (Windows)

:: dumpbin ships with Visual Studio (run it from the Developer Command Prompt)
dumpbin /headers myapp.exe

:: 3rd Party Recommendation
:: "PE-Bear" is an excellent visual tool for exploring PE structures.

3. Mach-O: The Apple Way

Mach-O (Mach Object) reflects its NeXTSTEP heritage. It organizes data into “Load Commands” rather than fixed tables.

Structure

  1. Mach Header: Magic bytes (0xFEEDFACF for 64-bit), CPU type (ARM64/x86-64).
  2. Load Commands: A variable-length list of commands that tell the kernel what to do. The most important ones are:
       • LC_SEGMENT_64: Map a segment into memory.
       • LC_MAIN: Specifies the entry point (modern replacement for LC_UNIXTHREAD).
       • LC_LOAD_DYLIB: Load a dynamic library (like libSystem.B.dylib).
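
Because every load command begins with the same two fields (a command ID and its size), walking the list is straightforward. The sketch below handles only the three commands listed above and assumes a little-endian, 64-bit, single-architecture file; `walk_load_commands` is a hypothetical name.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_LOAD_DYLIB = 0x0C
LC_SEGMENT_64 = 0x19
LC_MAIN = 0x80000028


def walk_load_commands(data: bytes) -> list:
    """Return (command_name, detail) tuples for the commands we recognize."""
    # mach_header_64: magic, cputype, cpusubtype, filetype, ncmds,
    # sizeofcmds, flags, reserved (8 x 32-bit fields = 32 bytes).
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags, _rsvd = struct.unpack_from(
        "<8I", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("expected a little-endian 64-bit Mach-O file")
    found = []
    offset = 32  # load commands start right after the header
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command")
        if cmd == LC_SEGMENT_64:
            # segname is a 16-byte, NUL-padded field after cmd/cmdsize.
            name = data[offset + 8:offset + 24].rstrip(b"\0").decode()
            found.append(("LC_SEGMENT_64", name))
        elif cmd == LC_MAIN:
            (entryoff,) = struct.unpack_from("<Q", data, offset + 8)
            found.append(("LC_MAIN", entryoff))
        elif cmd == LC_LOAD_DYLIB:
            # The library path is stored at an offset relative to this command.
            (name_off,) = struct.unpack_from("<I", data, offset + 8)
            raw = data[offset + name_off:offset + cmdsize]
            found.append(("LC_LOAD_DYLIB", raw.split(b"\0", 1)[0].decode()))
        offset += cmdsize
    return found
```

Note that LC_MAIN stores a file offset into the __TEXT segment rather than a virtual address, which is one way it differs from the ELF and PE entry point fields.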

Key Inspection Tools (macOS)

# View Header
otool -hv /bin/ls

# View Load Commands
otool -l /bin/ls

Comparison: The Critical Sections

No matter the format, you will always find these three fundamental regions, just named differently:

| Concept | ELF Name | PE Name | Mach-O Name | Purpose |
|---------|----------|---------|-------------|---------|
| Executable Code | .text | .text | __TEXT,__text | Your compiled machine instructions. Read-Only + Executable. |
| Global Variables | .data | .data | __DATA,__data | Initialized global/static variables (int x = 5;). Read/Write. |
| Uninitialized Data | .bss | .bss* | __DATA,__bss | Uninitialized or zero-initialized global variables (int x;). Takes no space on disk, only in memory. |
| Read-Only Data | .rodata | .rdata | __TEXT,__const (string literals go in __TEXT,__cstring) | Constants and string literals. |

* Note: PE often merges BSS into other sections or handles it via VirtualSize > SizeOfRawData.
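
The "no space on disk" trick comes down to two size fields: ELF segments carry p_filesz and p_memsz, and PE sections carry SizeOfRawData and VirtualSize. A minimal sketch of what a loader does with them follows; `materialize` is a hypothetical helper, not a real loader API.

```python
def materialize(file_bytes: bytes, file_size: int, mem_size: int) -> bytes:
    """Build the in-memory image of one segment or section.

    file_size is the number of bytes stored on disk (p_filesz or
    SizeOfRawData) and mem_size is the size in memory (p_memsz or
    VirtualSize). Any memory beyond the stored bytes is zero-filled,
    which is where BSS-style variables live.
    """
    stored = file_bytes[:min(file_size, mem_size)]
    return stored + bytes(mem_size - len(stored))


# Four bytes of initialized data on disk, twelve bytes in memory:
image = materialize(b"\x05\x00\x00\x00", file_size=4, mem_size=12)
```

Here the last eight bytes of `image` are zeros that never existed in the file.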


Why This Matters for “Program Startup”

We are analyzing these formats for one specific reason: to understand Program Startup.

When you run a program, the kernel:

  1. Reads the Header to ensure it supports the architecture.
  2. Maps the Segments into virtual memory based on the tables we just saw.
  3. Calculates the Entry Point (adding the randomized load base to the header’s entry address when ASLR relocates the image).
  4. Jumps to that Entry Point address.
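
Step 3 is simple arithmetic, which the toy model below tries to capture. It is not how any real kernel chooses addresses: the function names, the `entropy_bits` parameter, and its default value are all assumptions made for illustration.

```python
import secrets

PAGE_SIZE = 0x1000  # a common page size; real systems vary


def pick_load_base(position_independent: bool, preferred_base: int,
                   entropy_bits: int = 16) -> int:
    """Choose where the image starts in virtual memory (toy model)."""
    if not position_independent:
        # A fixed-address image must be loaded exactly where it asks.
        return preferred_base
    # Slide the image by a random, page-aligned amount.
    return preferred_base + secrets.randbits(entropy_bits) * PAGE_SIZE


def runtime_entry(load_base: int, entry_offset: int) -> int:
    """Entry address = where the image landed + entry offset from the header."""
    return load_base + entry_offset
```

For a PE file loaded at its preferred ImageBase, `runtime_entry(0x140000000, 0x1000)` gives 0x140001000. For a position-independent ELF, the header's entry value plays the role of `entry_offset`.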

That exact moment, the jump to the Entry Point, is where the executable format’s job ends and the Program Startup code begins.

In Part 5, we will start execution at that exact instruction (often called _start) and trace the fascinating journey that happens before your main() function ever gets called. If you want to review how arguments are passed during this process, check out Part 2: Calling Conventions.

