Inside the Linux kernel's engine room, there’s a quiet power struggle: how memory is safely handed out to devices for DMA. For years, the kernel’s Intel VT-d IOMMU (Input–Output Memory Management Unit) driver unintentionally permitted a rarely used “write-only” mode for certain page-table entries. That might sound harmless, but the mode behaved differently depending on which paging format was in use, which could produce confusing, hard-to-reproduce bugs. The fix tracked as CVE-2021-47035 addresses this by removing write-only permission from those mappings. Let's break down what happened, why it’s important, and how the kernel fixed it.
The Confusion
The Linux VT-d driver allowed Write-Only (WO) mappings for devices when “second-level” paging was in use, because read and write permissions are separate bits there. In first-level paging, however, a "present" entry is always at least readable, so write-only cannot be expressed.
This meant the same WO mapping could "work" in one paging mode but break or behave oddly in the other. The kernel should not offer “write-only” mappings when one of its translation modes cannot honor that restriction.
- First-level translation (Device Address → Physical Address): Only RO and RW, no WO allowed, because the present bit implies read permission.
- Second-level (Guest OS): The code let you set write-only mappings, which is inconsistent with first-level behavior. After the fix, the read bit is always set when an entry is present.
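To make the mismatch concrete, here is a small userspace model of the two entry formats. It is only a sketch: the macro and function names are hypothetical, and the bit positions (bit 0 for read/present, bit 1 for write) are assumptions based on the VT-d entry layouts rather than code taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions are assumptions modeled on VT-d entries. */
#define SL_PTE_READ    (UINT64_C(1) << 0)  /* second-level: read permission   */
#define SL_PTE_WRITE   (UINT64_C(1) << 1)  /* second-level: write permission  */
#define FL_PTE_PRESENT (UINT64_C(1) << 0)  /* first-level: present => readable */
#define FL_PTE_WRITE   (UINT64_C(1) << 1)  /* first-level: read/write bit     */

enum dma_access { ACCESS_NONE, ACCESS_RO, ACCESS_WO, ACCESS_RW };

/* Second-level: read and write are independent bits, so WO is expressible. */
enum dma_access sl_effective_access(uint64_t pte)
{
    int r = (pte & SL_PTE_READ) != 0;
    int w = (pte & SL_PTE_WRITE) != 0;

    if (r && w)
        return ACCESS_RW;
    if (r)
        return ACCESS_RO;
    if (w)
        return ACCESS_WO;
    return ACCESS_NONE;
}

/* First-level: a non-present entry grants nothing; a present one always reads. */
enum dma_access fl_effective_access(uint64_t pte)
{
    if (!(pte & FL_PTE_PRESENT))
        return ACCESS_NONE;
    return (pte & FL_PTE_WRITE) ? ACCESS_RW : ACCESS_RO;
}

/* The same "write bit only" pattern means different things in each format. */
void check_write_only_mismatch(void)
{
    uint64_t write_bit_only = UINT64_C(1) << 1;

    assert(sl_effective_access(write_bit_only) == ACCESS_WO);
    assert(fl_effective_access(write_bit_only) == ACCESS_NONE);
    assert(fl_effective_access(write_bit_only | FL_PTE_PRESENT) == ACCESS_RW);
}
```

In this model there is no first-level bit pattern that yields ACCESS_WO, which is the inconsistency the fix removes by no longer producing write-only entries at all.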
Why does this really matter?
- Security: The mismatch may be leveraged to confuse memory access controls, possibly creating attack windows.
- Stability: Applications or drivers expecting “write-only” to function might find their code breaks, or worse, leaks information.
The Patch: Banning “Write-Only” for Good
Here’s a simplified sketch of the permission translation the fix touches (an illustration, not the verbatim commit):
static u64 intel_get_iommu_access_flags(int prot)
{
    u64 ret = 0; // Start with no permissions
    if (prot & IOMMU_WRITE)
        ret |= DMA_PTE_WRITE; // Set write flag
    if (prot & IOMMU_READ)
        ret |= DMA_PTE_READ; // Set read flag
    return ret;
}
// After the fix, a request with IOMMU_WRITE but no IOMMU_READ must not end up as a write-only entry.
Before the fix, a caller could pass only IOMMU_WRITE and still get a page-table entry created, one marked writable but not readable. The fixed code makes sure the read bit is set on every present entry, so a “write-only” mapping can no longer be created.
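The before-and-after behavior can be modeled in a few lines of userspace C. This is a sketch under stated assumptions, not the kernel's code: REQ_READ and REQ_WRITE stand in for the IOMMU_READ and IOMMU_WRITE request flags, the PTE_* bits are hypothetical, and both function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the IOMMU_READ / IOMMU_WRITE request flags. */
#define REQ_READ  (1 << 0)
#define REQ_WRITE (1 << 1)

/* Hypothetical entry bits: bit 0 doubles as "present" and "readable". */
#define PTE_READ_PRESENT (UINT64_C(1) << 0)
#define PTE_WRITE        (UINT64_C(1) << 1)

/* Pre-fix model: bits are copied from the request, so write-only can occur. */
uint64_t pte_bits_before_fix(int prot)
{
    uint64_t bits = 0;

    /* Only the two request flags are understood by this model. */
    assert((prot & ~(REQ_READ | REQ_WRITE)) == 0);

    if (prot & REQ_READ)
        bits |= PTE_READ_PRESENT;
    if (prot & REQ_WRITE)
        bits |= PTE_WRITE;
    return bits;
}

/* Post-fix model: every entry that gets installed carries the read bit. */
uint64_t pte_bits_after_fix(int prot)
{
    if (prot == 0)
        return 0; /* nothing requested, so no entry is installed */
    return pte_bits_before_fix(prot) | PTE_READ_PRESENT;
}

/* True when an entry is writable but not readable. */
bool is_write_only(uint64_t bits)
{
    return (bits & PTE_WRITE) && !(bits & PTE_READ_PRESENT);
}
```

With these definitions, is_write_only(pte_bits_before_fix(REQ_WRITE)) is true, while pte_bits_after_fix(REQ_WRITE) comes back as a read-write entry.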
Bad Mapping Scenario
Suppose a device driver sets up a page as “write-only” using VT-d, intending to drop read access for extra security:
dma_map_page(dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE); /* device writes only: maps to write permission without read */
Before the fix, this could appear to succeed in some configurations, yet with first-level translation the device could still read the page, because a present entry is always readable.
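DMA directions are named from the device's point of view, which is easy to get backwards. The sketch below models how a direction is typically turned into read and write permissions for the device. The MODEL_* names are hypothetical mirrors of the kernel's direction and IOMMU flag constants, and the mapping follows my understanding of the generic DMA layer rather than quoting it.

```c
#include <assert.h>

/* Hypothetical mirror of the kernel's DMA direction values. */
enum model_dma_dir {
    MODEL_DMA_BIDIRECTIONAL, /* device reads and writes the buffer      */
    MODEL_DMA_TO_DEVICE,     /* CPU fills the buffer, device reads it   */
    MODEL_DMA_FROM_DEVICE,   /* device writes the buffer, CPU reads it  */
    MODEL_DMA_NONE
};

#define MODEL_IOMMU_READ  (1 << 0) /* device may read  */
#define MODEL_IOMMU_WRITE (1 << 1) /* device may write */

/* Translate a direction into the device-side permissions it requests. */
int model_dir_to_prot(enum model_dma_dir dir)
{
    /* Reject values outside the enum defined above. */
    assert(dir <= MODEL_DMA_NONE);

    switch (dir) {
    case MODEL_DMA_BIDIRECTIONAL:
        return MODEL_IOMMU_READ | MODEL_IOMMU_WRITE;
    case MODEL_DMA_TO_DEVICE:
        return MODEL_IOMMU_READ;
    case MODEL_DMA_FROM_DEVICE:
        return MODEL_IOMMU_WRITE;
    default:
        return 0;
    }
}

/* True when a direction asks for device-write access without device-read. */
int model_requests_write_only(enum model_dma_dir dir)
{
    int prot = model_dir_to_prot(dir);

    return (prot & MODEL_IOMMU_WRITE) && !(prot & MODEL_IOMMU_READ);
}
```

Under this model, only the from-device direction produces a write-only request, so that is the case where a driver might wrongly assume the device cannot read the buffer.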
Exploit Vector
An attacker with enough privileges might engineer memory mappings where supposedly “write-only” buffers are actually readable in some contexts, defeating attempts to sandbox or limit DMA. Imagine a buffer meant only for device output that can now be read back by a device or a malicious guest, leaking secrets.
The opportunity is limited (a local attacker who needs control over IOMMU mappings), but it’s real for virtualized or containerized environments.
Solution and Recommendation
- Updated kernels now reject or ignore write-only mappings, requiring at least the “read” flag when pages are present.
- Devices, virtual machines, and drivers should be updated to avoid relying on “write-only” and always expect read permission with write.
Are You Affected?
- Kernel 5.13 and newer include the fix. See Linux stable notes.
- Distributions: Check your distro’s security tracker. Example: Debian Security Tracker
References
- CVE-2021-47035 on NVD
- Kernel git commit (Torvalds tree)
- LKML Patch Discussion
- IOMMU Documentation
Conclusion
CVE-2021-47035 closes a subtle but important gap in the Linux kernel’s VT-d memory permissions. By banning the accidental “write-only” mode, it makes the kernel behave the same way in both paging modes, blocking a niche but real path to hard-to-diagnose bugs and data leaks. If you work close to the kernel or low-level drivers, keep your kernel up to date, and remember: sometimes, less permission choice means less trouble.
Stay secure, and happy hacking!
Timeline
Published on: 02/28/2024 09:15:39 UTC
Last modified on: 01/09/2025 15:15:16 UTC
