A critical stability bug was recently patched in the Linux kernel’s AMD display stack, specifically within the Direct Rendering Manager (DRM) code. The issue, tracked as CVE-2024-43904, affects the function handling idle power optimizations for AMD GPUs. Without this fix, certain pointer variables (stream and plane) could be NULL—but were used later without any safety check. This programming oversight could crash the kernel, resulting in a denial of service.
This article explains the vulnerability, demonstrates the risky behavior, shows you the fix as submitted upstream, and outlines how to check if your system is affected. While not easily exploitable for privilege escalation, this bug could crash graphical systems or servers using AMD GPUs. All kernel users running AMD GPUs are encouraged to update to a patched version.
The Function in Question
The culprit is the function dcn30_apply_idle_power_optimizations inside the file dcn30_hwseq.c.
Within this function, the two pointers stream and plane were treated as possibly NULL at one point in the code (around line 922), but were later dereferenced without any further check (lines 938 and 940), inviting a NULL pointer dereference.
Here’s a simplified look at the problem:
// Around line 922: the code treats NULL as a real possibility...
if (stream && plane)
    prepare(stream, plane);
// ...but near lines 938-940 both pointers are used unconditionally
do_something_with(stream->foo);
do_something_with(plane->bar);
// If stream or plane is NULL, the kernel dereferences NULL here!
The earlier check shows that the code considers NULL a possible value for both pointers, yet the later dereferences are not covered by any check. Static analysis tools flag exactly this kind of inconsistency, and a NULL value reaching those lines (due to logic errors or odd display states) would crash the kernel.
To address this, the following check was added before dereferencing stream and plane:
// New, safer code
if (!stream || !plane)
    return;
do_something_with(stream->foo);
do_something_with(plane->bar);
Here’s an excerpt from the actual commit:
+ if (!stream || !plane)
+ return;
This ensures that the function exits gracefully if stream or plane is null, rather than crashing the system.
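The guard pattern can be tried outside the kernel with mock types. The following is a minimal userspace sketch, not driver code: mock_stream, mock_plane, and apply_idle_opt are hypothetical names invented for illustration, and the computation inside is a placeholder.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's stream and plane state. */
struct mock_stream { int width; };
struct mock_plane  { int pitch; };

/*
 * Returns false without touching either pointer if one of them is NULL,
 * mirroring the early-exit guard described above. Otherwise it stores a
 * placeholder value derived from both objects in *out and returns true.
 * The caller must pass a valid `out` pointer.
 */
bool apply_idle_opt(const struct mock_stream *stream,
                    const struct mock_plane *plane, int *out)
{
    if (!stream || !plane)
        return false;

    /* Safe: both pointers were checked immediately before use. */
    *out = stream->width * plane->pitch;
    return true;
}
```

Calling apply_idle_opt(NULL, &plane, &out) returns false instead of faulting, which is the behavior the patch aims for: the caller sees a failed optimization rather than a crash.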
Can This Be Exploited?
While there isn’t a practical remote code execution exploit (at least from public research as of now), this bug could be triggered locally under specific (possibly rare) race conditions or user actions—primarily via complex display configuration changes, hotplugs, or corrupted state managed by DRM in AMD GPUs.
Possible attack surface includes:
- Malicious programs interacting with /dev/dri or X11/Wayland display APIs
- Attackers purposefully manipulating display state or feeding malformed input to the GPU subsystem (rare, but possible for local users)
Proof-Of-Concept (for demonstration)
A minimal PoC isn't readily available due to the complexity of GPU driver state. In principle, though, if the display stack reaches the unpatched function with a NULL stream or plane (or if you remove the check from the patched function and arrange for that state), the kernel will dereference NULL and crash. This can be simulated if you are developing kernel modules and can trigger display reconfiguration in odd states.
Simple C Kernel Module Sketch
// Non-working illustration only (do NOT run on a production system!)
// The field names below are placeholders, not the driver's real members.
struct dc_stream_state *stream = NULL;
struct dc_plane_state *plane = NULL;
// Each line below dereferences a NULL pointer
pr_info("Stream format: %d\n", stream->format);
pr_info("Plane type: %d\n", plane->type);
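Dereferencing NULL pointers like this in kernel context triggers a kernel oops, which can escalate to a full panic depending on the context and on settings such as panic_on_oops.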
Linux Distribution Security Notices:
- Red Hat Bugzilla
- Debian Security Advisories
- Arch Linux CVE page
Manual Patch Inspection:
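- Look for the null check in dcn30_apply_idle_power_optimizations in your kernel source, for example /usr/src/linux/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c (older kernel trees keep the file under dc/dcn30/ instead)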
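The manual inspection can be roughly automated. The sketch below is a heuristic only: source_has_null_guard is a hypothetical helper that reports whether any line of a file mentions both "!stream" and "!plane". It does not confirm that the line sits inside the affected function, and a backport formatted differently would be missed.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns true if any line of the file at `path` contains both
 * "!stream" and "!plane", matching the guard quoted in this article.
 * Returns false if the file cannot be opened or no such line exists.
 */
bool source_has_null_guard(const char *path)
{
    char line[1024];
    bool found = false;
    FILE *fp = fopen(path, "r");

    if (!fp)
        return false;

    /* Scan line by line and stop at the first match. */
    while (fgets(line, sizeof(line), fp)) {
        if (strstr(line, "!stream") && strstr(line, "!plane")) {
            found = true;
            break;
        }
    }

    fclose(fp);
    return found;
}
```

A false result is not proof of vulnerability, so treat it as a prompt to read the function by hand or to consult your distribution's advisory.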
- Update to the latest kernel available for your distribution.
- If you build your own kernel, apply the official patch from upstream.
References
- Upstream Kernel Patch Commit
- CVE-2024-43904 on MITRE
- Linux AMD Display Driver (drm/amd/display) Source Code
- Linux Kernel Security Mailing List LKML
In Summary
CVE-2024-43904 is a null pointer dereference vulnerability in the Linux kernel’s AMD graphics driver, fixed by simply checking for NULL pointers before use. It shows how a tiny omission can have system-wide impact in critical code, and why rigorous static code analysis tools and routine code review are so important for kernel development. If you use AMD graphics hardware on Linux, updating your kernel is the best way to keep your system secure and stable.
*Stay tuned for further updates, and always keep your Linux system patched!*
Timeline
Published on: 08/26/2024 11:15:04 UTC
Last modified on: 08/27/2024 13:40:50 UTC