Page Tables: The Bureaucracy Of Virtual Memory


Yesterday we met the MMU, the border guard of memory.

Today we inspect the paperwork the guard reads:

page tables.

If the MMU is the checkpoint,

page tables are the registry office.

They say which virtual pages exist, where they go, and what crimes they are allowed to commit.

I. The Simple Lie

A virtual page maps to a physical frame.

That is the simple story.

virtual page 0x12345 -> physical frame 0xabcde
offset stays the same
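The simple story fits in a few lines. This is a minimal sketch, assuming 4 KiB pages (so a 12-bit offset) and a hypothetical one-entry dictionary named `page_to_frame` standing in for the table; neither the name nor the dictionary comes from any real kernel.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages, 2**12 bytes each
PAGE_MASK = (1 << PAGE_SHIFT) - 1    # low 12 bits are the offset

# Hypothetical one-entry mapping, using the numbers from the example above.
page_to_frame = {0x12345: 0xABCDE}

def translate(vaddr: int) -> int:
    """Swap the virtual page number for a frame number; keep the offset."""
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & PAGE_MASK
    frame = page_to_frame[vpn]       # a KeyError here stands in for a page fault
    return (frame << PAGE_SHIFT) | offset

print(hex(translate(0x12345678)))    # 0xabcde678
```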

But a real address space may contain huge regions:

  • executable code
  • shared libraries
  • heap
  • stack
  • memory-mapped files
  • kernel mappings
  • guard pages
  • device mappings
  • empty holes waiting to catch fools

A flat table for all possible virtual pages would be enormous.
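How enormous? A rough calculation, assuming the 48-bit virtual addresses, 4 KiB pages, and 8-byte entries of 4-level x86-64 paging; the variable names are only for illustration.

```python
VA_BITS = 48          # virtual address width under 4-level paging
PAGE_SHIFT = 12       # 4 KiB pages
ENTRY_BYTES = 8       # one 64-bit entry per virtual page

entries = 1 << (VA_BITS - PAGE_SHIFT)      # 2**36 virtual pages
flat_table_bytes = entries * ENTRY_BYTES   # 2**39 bytes

print(entries)                             # 68719476736
print(flat_table_bytes // 2**30, "GiB")    # 512 GiB
```

That is 512 GiB of table per address space, almost all of it describing pages that do not exist.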

So modern architectures use multi-level page tables.

Translation becomes a bureaucratic walk through offices.

II. Four-Level x86-64 Paging

Common x86-64 paging uses four levels:

```mermaid
flowchart LR
    VA["virtual address"]
    PML4["PML4"]
    PDPT["PDPT"]
    PD["Page Directory"]
    PT["Page Table"]
    PAGE["4 KiB physical page"]

    VA --> PML4 --> PDPT --> PD --> PT --> PAGE
```

For 4 KiB pages, the low 48 bits of the virtual address are split into four index fields plus an offset. The upper 16 bits must be copies of bit 47.

Conceptually:

virtual address bits:
  [ PML4 ][ PDPT ][ PD ][ PT ][ page offset ]
      9      9      9     9        12

Nine bits select one of 512 entries at each level.

Twelve bits select a byte inside the 4 KiB page.
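The split can be sketched with shifts and masks. The function name `split_va` and the dictionary keys are illustrative assumptions; the shift amounts follow directly from the 9/9/9/9/12 layout above.

```python
def split_va(vaddr: int) -> dict:
    """Cut a 48-bit virtual address into four 9-bit indices and a 12-bit offset."""
    return {
        "pml4": (vaddr >> 39) & 0x1FF,   # bits 47..39
        "pdpt": (vaddr >> 30) & 0x1FF,   # bits 38..30
        "pd": (vaddr >> 21) & 0x1FF,     # bits 29..21
        "pt": (vaddr >> 12) & 0x1FF,     # bits 20..12
        "offset": vaddr & 0xFFF,         # bits 11..0
    }

# Build an address from known indices, then take it apart again.
vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x567
print(split_va(vaddr))
# {'pml4': 1, 'pdpt': 2, 'pd': 3, 'pt': 4, 'offset': 1383}
```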

The numbers are not decoration.

Each entry is eight bytes, and 512 entries of eight bytes fill exactly 4 KiB, so every table at every level fits in one page.

The Ministry loves symmetry when it can tax it.
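The walk through the four offices can be modeled as a toy. This sketch assumes nested Python dictionaries in place of real 4 KiB tables; `map_page` and `walk` are hypothetical helper names, and real hardware reads physical memory rather than dictionaries.

```python
def indices(vaddr):
    """Four 9-bit indices: PML4, PDPT, PD, PT."""
    return [(vaddr >> shift) & 0x1FF for shift in (39, 30, 21, 12)]

def map_page(root, vaddr, frame):
    """Create intermediate tables on demand, then store the frame at the last level."""
    *upper, last = indices(vaddr)
    table = root
    for idx in upper:
        table = table.setdefault(idx, {})
    table[last] = frame

def walk(root, vaddr):
    """Return (physical address, table reads); raise LookupError on a hole."""
    node, reads = root, 0
    for idx in indices(vaddr):
        reads += 1
        if idx not in node:
            raise LookupError(f"page fault at {vaddr:#x}")
        node = node[idx]
    return (node << 12) | (vaddr & 0xFFF), reads

root = {}
map_page(root, 0x7FFFFFFFE000, 0xABCDE)
paddr, reads = walk(root, 0x7FFFFFFFE123)
print(hex(paddr), reads)   # 0xabcde123 4
```

Only the tables along mapped paths exist, which is the point of the hierarchy: the empty holes cost nothing.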

III. CR3: The Address-Space Passport

On x86, the CR3 register points to the root of the active page-table hierarchy.

When the operating system switches from one process to another, it can switch address spaces by changing the page-table root.

Conceptually:

```asm
; simplified: load new page-table root
mov cr3, rax
```

Do not paste this into your shell.

This is the kind of instruction that belongs to kernels, hypervisors, boot code, and people who already know which machine they are about to ruin.

| Register / structure | Job |
|---|---|
| CR3 | root pointer for current address-space translations |
| PML4 | top-level page map in 4-level paging |
| PTE | final entry for a 4 KiB page |
| physical frame | actual memory backing the page |

Every process receives a different map.

The physical RAM is shared.

The lies are personalized.
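A toy model of personalized lies, assuming a hypothetical `ToyCPU` class whose `cr3` attribute holds whichever table is active. A flat dictionary stands in for the whole hierarchy, and the process names and frame numbers are invented for illustration.

```python
class ToyCPU:
    """Hypothetical CPU model: cr3 refers to the active address space's table."""
    def __init__(self):
        self.cr3 = None

    def translate(self, vaddr):
        page, offset = vaddr >> 12, vaddr & 0xFFF
        return (self.cr3[page] << 12) | offset

# Two processes map the same virtual page to different frames.
shell_table = {0x400: 0x1A2B3}
editor_table = {0x400: 0x77777}

cpu = ToyCPU()
cpu.cr3 = shell_table                # "context switch" to the shell
seen_by_shell = cpu.translate(0x400123)
cpu.cr3 = editor_table               # "context switch" to the editor
seen_by_editor = cpu.translate(0x400123)

print(hex(seen_by_shell), hex(seen_by_editor))   # 0x1a2b3123 0x77777123
```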

IV. Page-Table Entries

A page-table entry does not merely point to memory.

It carries policy.

Typical x86 page-entry concepts include:

| Bit / field | Meaning |
|---|---|
| present | translation exists |
| read/write | writes allowed if set |
| user/supervisor | userland access allowed if set |
| accessed | set by hardware when the page is read or written |
| dirty | set by hardware when the page is written |
| page size | large page at this level |
| global | TLB entry survives a CR3 reload |
| NX | no-execute, if supported/enabled |

A page entry is a tiny dictatorship:

physical frame address + permission bits = law

The OS writes the law.

The MMU enforces it.

The process complains on social media.
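To make the law concrete, here is a sketch that unpacks a 64-bit entry. The bit positions are the architectural x86-64 ones (present at bit 0, read/write at bit 1, user at bit 2, accessed at bit 5, dirty at bit 6, page size at bit 7, global at bit 8, NX at bit 63, frame address in bits 12 to 51). The function name `decode_pte` and the dictionary keys are assumptions made for illustration.

```python
P, RW, US, A, D, PS, G = (1 << n for n in (0, 1, 2, 5, 6, 7, 8))
NX = 1 << 63
ADDR_MASK = 0x000F_FFFF_FFFF_F000    # bits 12..51 hold the frame address

def decode_pte(pte: int) -> dict:
    """Split an entry into its frame number and policy bits."""
    return {
        "frame": (pte & ADDR_MASK) >> 12,
        "present": bool(pte & P),
        "writable": bool(pte & RW),
        "user": bool(pte & US),
        "accessed": bool(pte & A),
        "dirty": bool(pte & D),
        # Only meaningful in upper-level entries; in a final 4 KiB PTE
        # bit 7 is the PAT bit instead.
        "page_size": bool(pte & PS),
        "global": bool(pte & G),
        "no_execute": bool(pte & NX),
    }

# A user-accessible, writable, non-executable data page.
pte = (0xABCDE << 12) | P | RW | US | NX
info = decode_pte(pte)
print(hex(info["frame"]), info["writable"], info["no_execute"])   # 0xabcde True True
```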

V. Huge Pages

Not every mapping has to end at a 4 KiB page.

x86-64 can use larger pages such as 2 MiB and 1 GiB, depending on mode and support.

| Page size | Why use it | Cost |
|---|---|---|
| 4 KiB | fine-grained protection and allocation | many entries |
| 2 MiB | fewer translations, better TLB reach | internal fragmentation |
| 1 GiB | huge mapping efficiency | coarse and expensive to reserve |

Huge pages are useful for databases, hypervisors, large memory workloads, and anything that wants fewer translation entries.

But huge pages are not free magic.

They trade precision for fewer bureaucrats.

The dictator approves in principle.

The memory allocator files objections.
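Both sides of the argument reduce to arithmetic. This sketch counts the mappings needed for a region, the bytes wasted by rounding up, and the memory a TLB can cover. The helper names and the 64-entry TLB are assumptions for illustration, not figures for any particular CPU.

```python
KiB, MiB, GiB = 2**10, 2**20, 2**30
PAGE_SIZES = {"4 KiB": 4 * KiB, "2 MiB": 2 * MiB, "1 GiB": 1 * GiB}

def pages_needed(nbytes, page_size):
    """Round up: a partly used page still costs a whole page."""
    return -(-nbytes // page_size)

def wasted(nbytes, page_size):
    """Internal fragmentation: bytes reserved but not requested."""
    return pages_needed(nbytes, page_size) * page_size - nbytes

def tlb_reach(entries, page_size):
    """Memory covered when every TLB entry maps one page of this size."""
    return entries * page_size

for name, size in PAGE_SIZES.items():
    print(name,
          pages_needed(1 * GiB, size),      # mappings for a 1 GiB region
          wasted(5 * MiB, size) // KiB,     # KiB wasted on a 5 MiB allocation
          tlb_reach(64, size) // MiB)       # MiB covered by a 64-entry TLB
```

Mapping 1 GiB takes 262144 entries at 4 KiB and 512 at 2 MiB, while a 5 MiB allocation wastes nothing at 4 KiB and a full 1 MiB at 2 MiB.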

VI. Copy-On-Write

Page tables make fork() elegant.

When a Unix process forks, the kernel does not need to copy all memory immediately.

It can map the same physical pages into both processes as read-only and mark them copy-on-write.

When one process tries to write, a page fault occurs. The kernel then copies that page and updates the writer’s page table.

```mermaid
sequenceDiagram
    participant P as Parent
    participant K as Kernel
    participant C as Child
    participant M as Physical page

    P->>K: fork()
    K->>P: map page read-only COW
    K->>C: map same page read-only COW
    C->>M: write attempt
    M-->>K: page fault
    K->>C: allocate private copy and resume
```

This is the kind of trick that makes operating systems feel civilized.

Nobody copies what may never be modified.

The state delays work until a citizen commits a write crime.
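The mechanism can be imitated in miniature. This is a toy, not a kernel: `Frame`, `Space`, `fork`, and `write` are hypothetical names, a reference count stands in for the kernel's bookkeeping, and every read-only page is treated as copy-on-write.

```python
class Frame:
    """One physical page: its bytes plus a count of mappings pointing at it."""
    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1

class Space:
    """One address space: virtual page number -> [frame, writable]."""
    def __init__(self):
        self.pages = {}

def fork(parent):
    child = Space()
    for vpn, entry in parent.pages.items():
        entry[1] = False                       # parent drops to read-only
        entry[0].refs += 1                     # the frame is now shared
        child.pages[vpn] = [entry[0], False]   # child maps the same frame
    return child

def write(space, vpn, offset, value):
    entry = space.pages[vpn]
    if not entry[1]:                           # the write "faults"
        if entry[0].refs > 1:                  # still shared: take a private copy
            entry[0].refs -= 1
            entry[0] = Frame(entry[0].data)
        entry[1] = True                        # sole owner may write again
    entry[0].data[offset] = value

parent = Space()
parent.pages[0] = [Frame(b"hello"), True]
child = fork(parent)
write(child, 0, 0, ord("j"))                   # only now is the page copied
print(bytes(parent.pages[0][0].data), bytes(child.pages[0][0].data))
# b'hello' b'jello'
```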

VII. Kernel Mappings

Many operating systems map kernel memory into every process address space, protected by supervisor-only permissions.

This makes system calls and interrupts efficient because the kernel is already mapped when control changes privilege.

But speculative execution attacks changed the comfort level.

After Meltdown-class issues, many systems adopted kernel page-table isolation: while user code runs, the active page tables map only the small part of the kernel needed to enter and leave it, and the full kernel mappings appear only after the switch into kernel mode.

This is the hardware lesson:

permissions are not only logical.

Microarchitecture can make forbidden knowledge observable through side channels.

The page table said no.

The cache whispered yes.

VIII. The Real Story (Suppressed)

Officially, PTE means Page Table Entry.

Suppressed expansion:

Permission To Exist.

If the present bit is clear, the page does not exist as far as the MMU is concerned.

If writable is clear, the page may not be modified.

If NX is set, the page may not execute.

The page asks:

“May I live?”

The PTE replies:

“Only as data.”

This is why the kernel is the Supreme Bureaucrat.

It does not allocate memory.

It issues visas.
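The visa office's three rules can be sketched as a single check. This is a simplification under stated assumptions: `check` is a hypothetical function, only the present, writable, and NX bits are modeled, and the user/supervisor distinction is left out.

```python
P, RW, NX = 1 << 0, 1 << 1, 1 << 63   # architectural x86-64 bit positions

def check(pte, *, write=False, execute=False):
    """Return 'ok' or the reason the access would fault."""
    if not pte & P:
        return "fault: not present"
    if write and not pte & RW:
        return "fault: write to read-only page"
    if execute and pte & NX:
        return "fault: execute from NX page"
    return "ok"

data_page = P | RW | NX               # may live, but only as data
print(check(data_page, write=True))   # ok
print(check(data_page, execute=True)) # fault: execute from NX page
print(check(0))                       # fault: not present
```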

IX. The Lesson

Page tables are not a side detail.

They are the operating system’s map of reality.

They enable:

  • process isolation
  • memory permissions
  • demand paging
  • copy-on-write
  • memory-mapped files
  • kernel/user separation
  • huge pages
  • virtualization foundations

But walking them is expensive.

Tomorrow we inspect the cache that prevents every memory access from becoming a committee meeting:

the TLB.

— Kim Jong Rails, Supreme Leader of the Republic of Derails