In early 2024, a vulnerability in the Linux kernel's open-source nouveau driver for NVIDIA GPUs was discovered and patched. The issue, identified as CVE-2024-26943, lies in the driver's device memory (dmem) management: the function nouveau_dmem_evict_chunk() did not handle memory allocation failures. If attackers can trigger resource exhaustion, this bug could potentially be turned into a denial of service, or possibly exploited further, on vulnerable systems using the nouveau driver.
In this article, we’ll break down what caused this bug, how it got fixed, and how an attacker could approach exploiting it. We provide code snippets, clear explanations, and direct links to references for your own further research.
What is the core issue?
The function nouveau_dmem_evict_chunk() is responsible for evicting pages mapped to device memory chunks. To do so, it allocates several arrays with kcalloc(). If the system is running low on physical memory, kcalloc() can return NULL. The function did not check for this, so the first access through one of these NULL pointers would cause a kernel oops or panic.
You can read the patch and the original kernel discussion here:
- commit 24d65883c, linux kernel
- CVE Page on NVD
Patch summary
- Added the __GFP_NOFAIL flag, which makes the allocation retry until it succeeds, so it *never* returns NULL. This flag is risky and should be used cautiously.
- Changed kcalloc() to kvcalloc(), which can fall back to memory that is not physically contiguous (making the allocation less likely to fail).
In simplified terms, here’s what was happening before the patch:
src_pfns = kcalloc(npages, sizeof(unsigned long), GFP_KERNEL);
dst_pfns = kcalloc(npages, sizeof(unsigned long), GFP_KERNEL);
dma_addrs = kcalloc(npages, sizeof(dma_addr_t), GFP_KERNEL);
/* ... code assumes these are valid pointers ... */
for (i = 0; i < npages; i++) {
    do_something(src_pfns[i]); /* if src_pfns is NULL, boom! */
}
If any of these allocations failed (i.e., returned NULL), the next dereference would trigger a crash.
Switch to kvcalloc() instead of kcalloc():
- kvcalloc() isn’t limited by the need for physically contiguous memory, which reduces the chance of allocation failure.
Force allocation to not fail:
- By passing the __GFP_NOFAIL flag, allocation will keep retrying until it succeeds. Usage of __GFP_NOFAIL is rare and supposed to be limited to places where failure absolutely cannot be handled gracefully.
Here’s the safe, patched version:
src_pfns = kvcalloc(npages, sizeof(unsigned long), GFP_KERNEL | __GFP_NOFAIL);
dst_pfns = kvcalloc(npages, sizeof(unsigned long), GFP_KERNEL | __GFP_NOFAIL);
dma_addrs = kvcalloc(npages, sizeof(dma_addr_t), GFP_KERNEL | __GFP_NOFAIL);
/* Now src_pfns, dst_pfns, and dma_addrs are guaranteed non-NULL. */
/* Memory from kvcalloc() must later be released with kvfree(), not kfree(). */
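To get a feel for why physical contiguity matters, the array sizes can be worked out by hand. The following is a rough sketch, not kernel code: it assumes 4 KiB pages and 8-byte elements (typical for unsigned long and dma_addr_t on 64-bit systems), and the npages values are purely illustrative. The helper names array_bytes and contiguous_order are hypothetical. It estimates how many bytes each array needs and which power-of-two block of pages ("order") a physically contiguous allocation of that size would require.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def array_bytes(npages: int, elem_size: int = 8) -> int:
    """Bytes needed for one array with one element per page."""
    return npages * elem_size


def contiguous_order(nbytes: int, page_size: int = PAGE_SIZE) -> int:
    """Smallest order such that 2**order pages cover nbytes.

    Rough model only: the real kernel serves small requests from
    slab caches rather than directly from the page allocator.
    """
    pages = max(1, -(-nbytes // page_size))  # ceiling division
    return (pages - 1).bit_length()


if __name__ == "__main__":
    for npages in (512, 4096, 262144):  # illustrative values
        size = array_bytes(npages)
        print(f"npages={npages}: {size} bytes, order {contiguous_order(size)}")
```

The larger the order, the harder it is for the kernel to find a free contiguous block on a fragmented system. An allocation that may fall back to non-contiguous pages, as kvcalloc() does, avoids that constraint.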
Example Exploit Approach
You might be wondering: Could an attacker exploit this?
While direct code execution is unlikely, a denial-of-service (DoS) is feasible. Here’s one way to approach an exploit:
1. Exhaust Physical RAM
If an attacker runs a process that consumes most of the system’s RAM and then triggers GPU memory operations that reach nouveau_dmem_evict_chunk() (such as heavy compute or graphics tasks), the kcalloc() calls can fail.
2. Trigger the Bug
With memory exhausted, repeated requests eventually hit the case where kcalloc() returns NULL, and the unpatched driver dereferences the NULL pointer.
3. Minimal PoC (Conceptual)
# This is a conceptual PoC: run in userspace, fill RAM, then spam GPU ops
import numpy as np

bigmem = []
try:
    while True:
        # np.ones writes to every page, so the memory is actually backed by RAM
        bigmem.append(np.ones(10000000))  # Fill memory!
except MemoryError:
    pass

# Now run a GPU load, for example: keep replaying a large OpenCL workload
# (CUDA requires NVIDIA's proprietary driver and does not run on nouveau).
# The assumption is that this leads to nouveau_dmem_evict_chunk() allocations.
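Alternatively, any application that allocates a lot of device memory through the nouveau stack (for example via VA-API or OpenCL) could indirectly trigger this under low memory. CUDA workloads are not relevant here, since CUDA requires NVIDIA's proprietary driver.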
Security Impact and Scope
- Systems affected: Linux distributions using the *nouveau* kernel module, with users who use NVIDIA graphics cards in a way that triggers device memory operations (gaming, computation, certain desktop environments, etc.).
- What an attacker could do: DoS condition — crashing the system or forcing a reboot by exhausting memory and triggering the driver bug.
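A quick way to see whether a machine is in scope is to check if the nouveau module is loaded. The sketch below parses the text of /proc/modules, where each line starts with the module name. The function names module_loaded and read_proc_modules are hypothetical helpers for illustration. Drivers built directly into the kernel are not listed in that file, so a negative result is only a hint.

```python
from pathlib import Path


def module_loaded(proc_modules_text: str, name: str = "nouveau") -> bool:
    """Return True if `name` appears as a module name in /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field is the module name.
        if fields and fields[0] == name:
            return True
    return False


def read_proc_modules(path: str = "/proc/modules") -> str:
    """Read the module list; return an empty string if it is unavailable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""
```

On a Linux system, module_loaded(read_proc_modules()) returns True when nouveau is currently loaded as a module.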
Recommendations
- Patch your kernels! Most distributions pushed this fix to supported kernels. If you use NVIDIA cards with *nouveau*, update.
- Monitor allocation failures: If you build custom kernels, make sure the fix is included, and watch the kernel log for allocation failures and NULL pointer dereferences. Memory handling bugs in drivers are critical.
- Audit for similar bugs: The pattern of unchecked allocation is common — look for kcalloc() and similar in other code.
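The audit suggested in the last item can be started with a simple text heuristic. The sketch below is an assumption-laden first pass, not a real static analyzer: it looks for simple assignments from kmalloc()-family calls, skips calls that pass __GFP_NOFAIL, and flags the variable if no NULL check on it appears in the next few lines. The function name find_unchecked_allocs and the window size are hypothetical choices. It only understands plain variable names (not expressions such as obj->ptr), so expect both false positives and misses.

```python
import re

# Simple assignments from allocators that may return NULL.
ALLOC_RE = re.compile(
    r"\b(?P<var>[A-Za-z_]\w*)\s*=\s*"
    r"(?P<fn>k[mzc]alloc|kv[mzc]alloc|kmalloc_array)\s*\("
)


def find_unchecked_allocs(source: str, window: int = 5):
    """Return (line_number, variable, function) for suspicious allocations."""
    lines = source.splitlines()
    findings = []
    for idx, line in enumerate(lines):
        match = ALLOC_RE.search(line)
        if not match:
            continue
        var = match.group("var")
        # Join the statement up to its ';' because calls may wrap lines.
        end = idx
        while ";" not in lines[end] and end + 1 < len(lines):
            end += 1
        statement = " ".join(lines[idx:end + 1])
        if "__GFP_NOFAIL" in statement:
            continue  # allocation is documented never to return NULL
        v = re.escape(var)
        check = re.compile(rf"!\s*{v}\b|\b{v}\s*==\s*NULL|NULL\s*==\s*{v}\b")
        following = lines[end + 1:end + 1 + window]
        if not any(check.search(text) for text in following):
            findings.append((idx + 1, var, match.group("fn")))
    return findings
```

Running it over the simplified pre-patch snippet from this article flags each kcalloc() line, while the patched kvcalloc() calls with __GFP_NOFAIL are skipped. Any hit still needs manual review.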
Conclusion
CVE-2024-26943 shows how a simple unchecked allocation can lead to system crashes in a major Linux graphics driver. The fix ensures safer memory allocation and guards against accidental system outages. For readers interested in Linux kernel security and driver hardening, this issue is a textbook case of why careful error checking matters.
For more technical details, check out these resources:
- Kernel Patch Commit
- Nouveau Driver Docs
- CVE-2024-26943 at NVD
Stay safe, and keep your Linux systems up to date 🚀.
Timeline
Published on: 05/01/2024 06:15:09 UTC
Last modified on: 03/03/2025 17:47:59 UTC