DMA And IOMMU: When Devices Touch Memory Directly
Yesterday we studied interrupts, the emergency hotline of the CPU.
Today the device stops calling.
It enters the archive itself.
This is DMA, Direct Memory Access.
DMA lets devices read and write memory without the CPU manually copying every byte.
This is essential for performance.
It is also horrifying.
I. Why DMA Exists
Imagine a network card receiving packets.
Without DMA, the CPU would copy data from the device one byte or word at a time (programmed I/O).
This is wasteful.
With DMA, the driver prepares buffers in memory and tells the device:
“Put packets there.”
```mermaid
flowchart LR
    NIC["network card"]
    RAM["RAM buffer"]
    CPU["CPU"]
    IRQ["interrupt when done"]
    CPU -->|program DMA descriptors| NIC
    NIC -->|DMA write packet data| RAM
    NIC -->|interrupt| IRQ --> CPU
```
The CPU becomes a manager.
The device becomes a worker with archive access.
Every security engineer now sits upright.
II. Scatter-Gather
Devices often use descriptor rings.
The driver builds a list of buffers.
The device walks the descriptors and performs transfers.
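```c
#include <stdint.h>

/* One entry in a descriptor ring: tells the device where to transfer and how much. */
struct dma_desc {
    uint64_t address;  /* bus address of the buffer, as the device sees it */
    uint32_t length;   /* buffer size in bytes */
    uint32_t flags;    /* device-specific control and status bits */
};
```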
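To make the ring idea concrete, here is a minimal sketch of how a driver might post buffers and how a device might consume them. It is a single-threaded simulation, not any real device's layout: `RING_SIZE`, `DESC_FLAG_OWN`, `struct dma_ring`, `ring_post`, and `ring_consume` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE     8u    /* hypothetical ring length */
#define DESC_FLAG_OWN 0x1u  /* hypothetical flag: the device owns this descriptor */

struct dma_desc {
    uint64_t address;
    uint32_t length;
    uint32_t flags;
};

struct dma_ring {
    struct dma_desc desc[RING_SIZE];
    uint32_t head;  /* next slot the driver fills */
    uint32_t tail;  /* next slot the device consumes */
};

/* Driver side: post one buffer. Returns 0 on success, -1 if the ring is full. */
int ring_post(struct dma_ring *r, uint64_t bus_addr, uint32_t len)
{
    assert(r != NULL);
    struct dma_desc *d = &r->desc[r->head % RING_SIZE];
    if (d->flags & DESC_FLAG_OWN)
        return -1;              /* the device has not finished with this slot */
    d->address = bus_addr;
    d->length = len;
    /* Real drivers need a write barrier here; omitted in this simulation. */
    d->flags = DESC_FLAG_OWN;   /* hand ownership to the device last */
    r->head++;
    return 0;
}

/* Device side (simulated): take one descriptor and give the slot back. */
int ring_consume(struct dma_ring *r, struct dma_desc *out)
{
    assert(r != NULL && out != NULL);
    struct dma_desc *d = &r->desc[r->tail % RING_SIZE];
    if (!(d->flags & DESC_FLAG_OWN))
        return -1;              /* nothing posted */
    *out = *d;                  /* copy includes the OWN bit as the device saw it */
    d->flags &= ~DESC_FLAG_OWN; /* return ownership to the driver */
    r->tail++;
    return 0;
}
```

The ownership bit is the whole protocol: the driver writes the address and length first and flips the flag last, so the device never walks a half-built descriptor.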
The address is not just a number.
It is authority.
If the device can DMA to arbitrary physical memory, it can overwrite the kernel, steal secrets, or modify user processes without asking the CPU’s normal permission machinery.
The MMU checks only the memory accesses the CPU issues.
Classic DMA bypasses it unless another guard exists.
III. The IOMMU
The IOMMU is the MMU’s cousin for devices.
Intel's implementation is VT-d (Virtualization Technology for Directed I/O).
AMD's equivalent is AMD-Vi.
The idea:
device DMA addresses are translated and permission-checked before reaching physical memory.
```mermaid
flowchart LR
    DEV["PCIe device"]
    DMA["DMA address"]
    IOMMU["IOMMU / DMA remapper"]
    TABLES["I/O page tables"]
    RAM["physical memory"]
    FAULT["DMA fault"]
    DEV --> DMA --> IOMMU
    IOMMU --> TABLES
    TABLES --> IOMMU
    IOMMU -->|allowed| RAM
    IOMMU -->|blocked| FAULT
```
Now devices receive borders too.
The network card may access packet buffers.
It may not rewrite the kernel’s throne room.
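The translate-and-check step can be sketched as a toy, single-level I/O page table. Real IOMMUs use multi-level tables defined by their own specifications; everything here (`struct io_pte`, `struct io_domain`, `io_map`, `io_translate`, the 16-entry table, the permission bits) is a hypothetical simplification for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT    12u
#define PAGE_MASK     ((1ull << PAGE_SHIFT) - 1)
#define IO_PT_ENTRIES 16u   /* toy table: 16 pages of device address space */

#define IO_PTE_READ   0x1u
#define IO_PTE_WRITE  0x2u

struct io_pte {
    uint64_t phys_page;  /* physical page frame number */
    uint32_t perms;      /* 0 means not mapped */
};

struct io_domain {
    struct io_pte table[IO_PT_ENTRIES];
};

/* Map one device-visible page to one physical page. Both must be page-aligned. */
bool io_map(struct io_domain *dom, uint64_t iova, uint64_t phys, uint32_t perms)
{
    assert(dom != NULL);
    uint64_t idx = iova >> PAGE_SHIFT;
    if (idx >= IO_PT_ENTRIES || (iova & PAGE_MASK) || (phys & PAGE_MASK))
        return false;
    dom->table[idx].phys_page = phys >> PAGE_SHIFT;
    dom->table[idx].perms = perms;
    return true;
}

/* Translate one device access. Returning false models a DMA fault. */
bool io_translate(const struct io_domain *dom, uint64_t iova, bool is_write,
                  uint64_t *phys)
{
    assert(dom != NULL && phys != NULL);
    uint64_t idx = iova >> PAGE_SHIFT;
    uint32_t need = is_write ? IO_PTE_WRITE : IO_PTE_READ;
    if (idx >= IO_PT_ENTRIES)
        return false;                       /* outside the device's address space */
    if ((dom->table[idx].perms & need) == 0)
        return false;                       /* unmapped, or permission missing */
    *phys = (dom->table[idx].phys_page << PAGE_SHIFT) | (iova & PAGE_MASK);
    return true;
}
```

A packet buffer mapped read-write translates; a page mapped read-only faults on write; an address nobody mapped faults on everything. That is the border.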
IV. Why Virtualization Needed It
Virtual machines make DMA more complicated.
A guest OS thinks it owns “physical” memory.
It does not.
It owns guest-physical memory, which the hypervisor maps to host physical memory.
If a device assigned to a VM performs DMA, the platform must ensure it reaches only that VM’s memory.
That is one of the IOMMU’s major jobs.
| Without IOMMU | With IOMMU |
|---|---|
| device can target host physical memory | device constrained by I/O page tables |
| PCI passthrough unsafe | PCI passthrough becomes practical |
| malicious device can scribble broadly | DMA faults and isolation possible |
| guest-physical addresses mean nothing to the device | remapping translates guest-physical DMA addresses to host-physical ones |
The IOMMU is why PCI passthrough is not simply handing a knife to a prisoner.
It is handing a knife through a fence with cameras.
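The guest-physical to host-physical confinement can be sketched under one loud assumption: each VM's memory is a single contiguous host range. Real hypervisors map guest memory page by page through I/O page tables, so `struct vm_domain` and `vm_dma_translate` below are hypothetical and deliberately simplified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: a VM's guest-physical memory is one contiguous host range. */
struct vm_domain {
    uint64_t host_base;  /* host-physical address backing guest-physical 0 */
    uint64_t size;       /* bytes of guest memory */
};

/*
 * Translate a DMA of len bytes starting at guest-physical address gpa.
 * Returning false models an access blocked by the IOMMU.
 */
bool vm_dma_translate(const struct vm_domain *vm, uint64_t gpa, uint64_t len,
                      uint64_t *hpa)
{
    assert(vm != NULL && hpa != NULL);
    /* Written to avoid overflow in gpa + len. */
    if (len == 0 || gpa >= vm->size || len > vm->size - gpa)
        return false;
    *hpa = vm->host_base + gpa;
    return true;
}
```

Two VMs can issue DMA to the same guest-physical address and land in different host memory. Neither can name an address outside its own range.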
V. Thunderbolt And External DMA
External high-speed buses made DMA attacks famous.
Thunderbolt tunnels PCIe outside the chassis. That is powerful and dangerous.
A malicious device with DMA capability can attack memory if protections are weak or disabled.
Modern systems use IOMMU-based DMA protection, security levels, authorization, and firmware policies to reduce this risk.
The lesson is older than Thunderbolt:
any device that can bus-master memory is not a peripheral.
It is a small government with a write permit.
VI. Bounce Buffers And Old Hardware
Not every device can address all memory.
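Old or limited devices may only DMA to certain address ranges or alignments. Operating systems work around this with bounce buffers: temporary memory areas the device can reach. For incoming data, the device writes to the bounce buffer and the OS copies it to the real destination; for outgoing data, the copy happens first.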
device -> bounce buffer -> final memory
This is slower.
It is also the kind of ugly compromise that keeps old hardware alive long after dignity has left.
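A minimal sketch of the receive path, assuming the device's limit is expressed as a highest reachable address. The device is simulated by a callback; `dma_reachable`, `dma_receive`, `device_fill_fn`, and `fill_pattern` are hypothetical names, not any operating system's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if a device limited to addresses <= dma_mask can reach [addr, addr + len). */
bool dma_reachable(uint64_t addr, uint64_t len, uint64_t dma_mask)
{
    if (len == 0 || addr > dma_mask)
        return false;
    return len - 1 <= dma_mask - addr;  /* overflow-safe addr + len - 1 <= dma_mask */
}

/* Hypothetical stand-in for hardware: the "device" fills a buffer it can reach. */
typedef void (*device_fill_fn)(void *buf, size_t len);

/* Simulated device that writes a fixed byte pattern. */
void fill_pattern(void *buf, size_t len)
{
    memset(buf, 0xAB, len);
}

/*
 * Receive len bytes into dst, whose bus address is dst_bus.
 * If the device cannot reach dst, it writes into bounce and the CPU copies.
 * Returns true if the bounce path was used.
 */
bool dma_receive(void *dst, uint64_t dst_bus, size_t len, uint64_t dma_mask,
                 void *bounce, device_fill_fn device_fill)
{
    assert(dst != NULL && bounce != NULL && device_fill != NULL);
    if (dma_reachable(dst_bus, len, dma_mask)) {
        device_fill(dst, len);   /* direct DMA, no extra copy */
        return false;
    }
    device_fill(bounce, len);    /* device -> bounce buffer */
    memcpy(dst, bounce, len);    /* bounce buffer -> final memory */
    return true;
}
```

The `memcpy` is the price. A 32-bit device on a machine with memory above 4 GiB pays it for every buffer that lands too high.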
VII. DMA Coherency
DMA interacts with CPU caches.
If the CPU has cached a buffer and a device writes to RAM, the CPU must not keep reading stale cache lines.
If the CPU writes data for a device to read, the device must see the correct contents.
Architectures and platforms differ in how coherent DMA is and what cache maintenance drivers must perform.
Driver writers must know the rules.
“It worked on my machine” is not a memory-ordering model.
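Both hazards can be reproduced in a toy model: one write-back cache line in front of one line of RAM, with no hardware coherency. This is a simulation only. `struct toy_mem` and its functions are hypothetical; the names `sync_for_device` and `sync_for_cpu` loosely echo the Linux DMA API's sync calls but are not that API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64u

/* Toy model: one cache line in front of one line of RAM, no hardware coherency. */
struct toy_mem {
    uint8_t ram[LINE_SIZE];
    uint8_t cache[LINE_SIZE];
    bool cache_valid;
    bool cache_dirty;
};

/* CPU read: served from the cache when valid, otherwise filled from RAM. */
uint8_t cpu_read(struct toy_mem *m, unsigned off)
{
    assert(off < LINE_SIZE);
    if (!m->cache_valid) {
        memcpy(m->cache, m->ram, LINE_SIZE);
        m->cache_valid = true;
    }
    return m->cache[off];
}

/* CPU write: lands in the cache only (write-back behaviour). */
void cpu_write(struct toy_mem *m, unsigned off, uint8_t v)
{
    (void)cpu_read(m, off);  /* make sure the line is allocated */
    m->cache[off] = v;
    m->cache_dirty = true;
}

/* Device accesses go straight to RAM and never see the cache. */
void dev_write(struct toy_mem *m, unsigned off, uint8_t v) { m->ram[off] = v; }
uint8_t dev_read(const struct toy_mem *m, unsigned off) { return m->ram[off]; }

/* Before the device reads: write dirty data back to RAM. */
void sync_for_device(struct toy_mem *m)
{
    if (m->cache_valid && m->cache_dirty) {
        memcpy(m->ram, m->cache, LINE_SIZE);
        m->cache_dirty = false;
    }
}

/* After the device writes: drop the cached copy so the next read refetches. */
void sync_for_cpu(struct toy_mem *m)
{
    m->cache_valid = false;
    m->cache_dirty = false;
}
```

Skip `sync_for_cpu` and the CPU keeps reading the old byte. Skip `sync_for_device` and the device reads whatever RAM held before the CPU wrote. On a coherent platform the hardware does this for you; on others, the driver does it or the bug does.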
VIII. The Real Story (Suppressed)
Officially, DMA means Direct Memory Access.
Suppressed expansion:
Device May Annex.
The first DMA controller asked:
“May I copy this block without bothering the CPU?”
The CPU said:
“Yes, but only that block.”
The device smiled in PCIe.
This is why the IOMMU was created:
not because devices were evil,
but because they were trusted like contractors with palace keys.
IX. The Lesson
DMA is essential.
Without it, modern I/O would drown the CPU in copying.
But DMA gives devices direct memory power, so platforms need IOMMUs, driver discipline, descriptor validation, cache coherency handling, and interrupt coordination.
The decree:
- DMA is performance with danger
- devices need memory borders too
- the IOMMU is the MMU for devices
- PCI passthrough depends on remapping
- external DMA is a real attack class
- driver bugs are foreign policy incidents
Tomorrow we enter a more secret mode:
SMM, Ring -2.
— Kim Jong Rails, Supreme Leader of the Republic of Derails