A critical vulnerability (CVE-2023-52474) was discovered and patched in the Linux kernel's InfiniBand hfi1 driver. It affects how user SDMA (Send DMA) requests are processed when they carry multiple payload iovecs that do not end at page boundaries. The bug can lead to data corruption and potentially to memory safety problems such as unintended data disclosure and kernel memory corruption, especially in environments that rely on high-performance networking with user-space SDMA. In this post, we break down what causes CVE-2023-52474, explain how the bug works under the hood, sketch the problematic code, walk through an exploit scenario, and explain the patch.
Background: What’s User SDMA and IOVEC?
- SDMA (Send DMA): A hardware engine that reads outgoing packet data directly from pinned user-space memory, so the CPU does not have to copy it.
- iovec: In Linux, a structure describing one memory buffer (base address and length). An array of iovecs forms a scatter/gather list, often used for high-performance data transfer.
The hfi1 driver supports SDMA requests made up of several iovec memory areas chained together.
1. Not Respecting iovec Length (iov_len)
When processing a user SDMA request, the SDMA engine would segment data from a given iovec into SDMA packets. Each iovec describes a buffer (base address, length). However, due to a mistake, the function would sometimes read *past* the buffer size specified by iov_len and use up to a page (PAGE_SIZE, usually 4KB) for the packet, copying extra unrelated memory.
2. Not Moving to Next iovec When Needed
If the current iovec buffer did *not* end on a page boundary and did not have enough data to fill an SDMA packet, the function would sometimes not move to the next iovec entry. This ends up sending garbage data or reusing stale data from the memory page.
In both cases, the packet sent across the network can contain memory the application never intended to transmit, which is where the potential for information disclosure comes from.
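The two rules above (stop at iov_len, and move on to the next iovec when the current one runs out) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: fill_packet() and its parameters are hypothetical names invented for this post, it uses memcpy on ordinary buffers rather than DMA descriptors, and it is not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical user-space model, not driver code: fill one packet of
 * pkt_len bytes from an array of iovecs. *idx and *off remember the
 * current iovec and the offset inside it between calls.
 * Returns the number of bytes placed in pkt.
 */
size_t fill_packet(unsigned char *pkt, size_t pkt_len,
                   const struct iovec *iov, size_t iovcnt,
                   size_t *idx, size_t *off)
{
    size_t done = 0;

    while (done < pkt_len && *idx < iovcnt) {
        assert(*off <= iov[*idx].iov_len);

        size_t left_in_iov = iov[*idx].iov_len - *off;
        size_t want = pkt_len - done;
        size_t n = want < left_in_iov ? want : left_in_iov;

        /* Rule 1: never read past iov_len, whatever the page layout is. */
        memcpy(pkt + done,
               (const unsigned char *)iov[*idx].iov_base + *off, n);
        done += n;
        *off += n;

        /* Rule 2: iovec exhausted, so advance instead of padding from its page. */
        if (*off == iov[*idx].iov_len) {
            (*idx)++;
            *off = 0;
        }
    }
    return done;
}
```

With an 8-byte iovec followed by a larger one and a 16-byte packet, this model takes 8 bytes from the first buffer and the remaining 8 from the second, rather than filling the packet from whatever follows the first buffer in memory.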
Here’s a simplified (and slightly pseudocode) version of the problematic part:
// Wrong: ignores iov.iov_len and uses PAGE_SIZE
// (simplified; packet and copy_to_packet_buffer() are illustrative names,
// and this is not the real function signature)
int user_sdma_txadd(struct user_sdma_iovec *uiov, void *packet) {
    void *p = uiov->iov.iov_base;
    // Mistake: reads up to PAGE_SIZE even if iov_len < PAGE_SIZE
    copy_to_packet_buffer(packet, p, PAGE_SIZE);
    return 0;
}
The correct implementation must copy no more than uiov->iov.iov_len bytes:
// Fixed: Uses the actual iovec length
copy_to_packet_buffer(packet, p, uiov->iov.iov_len);
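In practice a single chunk is bounded by more than one limit at once. A minimal sketch of that length calculation follows, assuming a 4096-byte page and a hypothetical helper named chunk_len(); neither the name nor the parameter list comes from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this sketch */

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/*
 * Hypothetical helper: how many bytes may be taken from the current page.
 *   iov_len  - total length of the iovec
 *   iov_off  - bytes of the iovec already consumed
 *   page_off - offset of the current position inside its page
 *   pkt_room - bytes still free in the packet being built
 */
size_t chunk_len(size_t iov_len, size_t iov_off,
                 size_t page_off, size_t pkt_room)
{
    assert(iov_off <= iov_len && page_off < MODEL_PAGE_SIZE);

    size_t to_page_end = MODEL_PAGE_SIZE - page_off;
    size_t to_iov_end = iov_len - iov_off;

    /* The buggy path effectively dropped the to_iov_end bound. */
    return min_sz(pkt_room, min_sz(to_page_end, to_iov_end));
}
```

For an 8-byte iovec at the start of a page, chunk_len(8, 0, 0, 4096) yields 8, where a page-only bound would have yielded 4096.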
mmu_rb_handler Bugs
SDMA’s memory pinning cache (mmu_rb_handler) speeds up and manages memory translation and pinning as the driver accesses user memory across multiple packets. It had additional race-condition and reference-counting bugs, which could result in:
- Memory region structures (mmu_rb_node) being freed while other code still held pointers to them
These are classic sources of use-after-free and double-free bugs in kernel code.
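One general way to avoid this class of bug is to change and check the reference count only while holding the cache lock, so that eviction can never observe a node as unused between lookup and increment. The sketch below is a user-space illustration with a pthread mutex; struct cache, struct cache_node, and the three functions are hypothetical names, not the driver's mmu_rb API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's cache and node types. */
struct cache {
    pthread_mutex_t lock;
};

struct cache_node {
    int refcount;   /* number of in-flight users of this node */
    bool in_cache;  /* still reachable through the cache? */
};

/* Take a reference under the lock, so eviction cannot race with it. */
void node_get(struct cache *c, struct cache_node *n)
{
    pthread_mutex_lock(&c->lock);
    n->refcount++;
    pthread_mutex_unlock(&c->lock);
}

/* Drop a reference under the same lock. */
void node_put(struct cache *c, struct cache_node *n)
{
    pthread_mutex_lock(&c->lock);
    assert(n->refcount > 0);
    n->refcount--;
    pthread_mutex_unlock(&c->lock);
}

/*
 * Evict only when nobody holds the node. Returns true if the node was
 * removed, in which case the caller may free it.
 */
bool node_try_evict(struct cache *c, struct cache_node *n)
{
    bool evicted = false;

    pthread_mutex_lock(&c->lock);
    if (n->in_cache && n->refcount == 0) {
        n->in_cache = false;
        evicted = true;
    }
    pthread_mutex_unlock(&c->lock);
    return evicted;
}
```

The point of the design is that the "is anyone using this?" check and the removal happen in one critical section, and failure paths go through the same check rather than freeing the node unconditionally.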
Data between iov_base + iov_len and the end of the page could contain secrets (heap or stack contents, for example).
The kernel then adds up to PAGE_SIZE bytes to the packet, leaking that adjacent memory.
Here is a sketch of how someone could weaponize this:
struct iovec iovs[2];
// First iovec is only 8 bytes, but the rest of its page is sensitive
iovs[0].iov_base = (void *)user_buffer;
iovs[0].iov_len = 8; // But user_buffer[8..4095] is a secret
// Second iovec is normal
iovs[1].iov_base = (void *)other_buffer;
iovs[1].iov_len = 4096;
// Issue SDMA request to hfi1
// (send_sdma_request() is a hypothetical helper, not a real API)
send_sdma_request(iovs, 2);
Outcome: On an unpatched kernel, up to 4096 bytes starting at iovs[0].iov_base can end up in the packet, 4088 bytes more than the 8 you intended to send. If user_buffer shares its page with confidential data, that data goes out on the wire.
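The patch makes the SDMA request path honor iov_len and advance to the next iovec when needed. It also includes multiple fixes for race conditions and reference counting in the mmu_rb_handler memory region cache.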
Kernel commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/infiniband/hw/hfi1?id=ef3a86a67e92c5965c00a96b222a956f1c93e3b
Official Patch Description:
- https://lore.kernel.org/all/20240126215553.1177743-1-mike.marcinisyn@cornelisnetworks.com/
Conclusion
CVE-2023-52474 is an excellent example of the dangers of poor length and boundary management in low-level iovec/parsing code. If you administer or deploy Linux on systems with hfi1 and InfiniBand SDMA, apply kernel updates immediately to protect your system. The exploit primitives are simple, the bug is easy to trigger, and the potential for data leakage or use-after-free is severe in shared environments.
Summary:
Always double-check your buffer management. When dealing with high-speed data paths and direct memory, these bugs can turn a 1-byte mistake into a full-blown compromise.
Timeline
Published on: 02/26/2024 18:15:07 UTC
Last modified on: 04/17/2024 17:15:54 UTC