In early 2024, the Linux kernel development community found and patched a critical bug in the Mellanox Spectrum family’s switch driver, specifically in the ACL (Access Control List) TCAM region management code. If you’re a Linux administrator running Mellanox switches, or a kernel developer interested in networking stack internals, this vulnerability—tracked as CVE-2024-26595—is worth knowing inside out.
Below, we’ll break down what happened, why it mattered, and how the kernel was fixed, all in everyday American English. We’ll also look at relevant code snippets, explore how this bug could have been triggered, and show where to find the official patches and references.
What Is CVE-2024-26595?
CVE-2024-26595 is a vulnerability in the Linux kernel's Mellanox Spectrum (mlxsw) switch driver. When an error occurred while setting up an Access Control List (ACL) region in hardware, the kernel tried to clean up the partially built region. But there was a bug: the destroy function read a pointer through the region's group pointer, which was still NULL at that point, and the kernel crashed.
In simple terms:
- If attaching a newly created ACL "region" to its group failed on a Spectrum switch, the cleanup code still evaluated region->group->tcam. Because the region had never been attached, region->group was NULL, and following it to reach tcam crashed the kernel.
The Technical Details
The problem lived in drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c. During configuration, when an error interrupted the process of attaching an ACL region to its group, the cleanup code tried to destroy that region. But the destroy function assumed the group and tcam pointers were already valid, which wasn’t always true.
What the Crash Looked Like
Here’s a taste of the crash from the bug report (source):
BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: 0010:mlxsw_sp_acl_tcam_region_destroy+0xa0/0xd0
Call Trace:
mlxsw_sp_acl_tcam_vchunk_get+0x88b/0xa20
mlxsw_sp_acl_tcam_ventry_add+0x25/0xe0
mlxsw_sp_acl_rule_add+0x47/0x240
mlxsw_sp_flower_replace+0x1a9/0x1d0
...
The Faulty Path (Before Patch)
Let’s look at the error-handling path that caused the trouble. Here’s a super simplified snippet (not actual kernel code):
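// In the error path, after attaching the region to its group has failed
mlxsw_sp_acl_tcam_region_destroy(mlxsw_sp, region); // region->group is still NULL here!
Inside mlxsw_sp_acl_tcam_region_destroy(), the code read region->group->tcam without checking that region->group was valid. If the region had never been attached to a group, region->group was still NULL. Boom: a NULL pointer dereference in the kernel, which shows up as an oops or, depending on configuration, a full panic.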
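To make the shape of that error path concrete, here is a minimal user-space sketch. Every type and function name in it (struct region, region_attach, region_create, destroy_would_deref_null) is a hypothetical stand-in rather than the driver's real definition, and the model only reports the NULL group pointer instead of actually dereferencing it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's types; not the kernel's definitions. */
struct tcam { int regions_in_use; };
struct group { struct tcam *tcam; };
struct region { struct group *group; };	/* stays NULL until attach succeeds */

/* Simulated attach step; `fail` stands for any hardware or resource error. */
static int region_attach(struct region *region, struct group *group, bool fail)
{
	if (fail)
		return -ENOBUFS;	/* region->group is left untouched (NULL) */
	region->group = group;
	return 0;
}

/* Pre-fix shape of the cleanup: it would reach the tcam via region->group. */
static bool destroy_would_deref_null(const struct region *region)
{
	assert(region != NULL);
	return region->group == NULL;	/* region->group->tcam would fault here */
}

/* Create a region; on attach failure, report the state the cleanup sees. */
static int region_create(struct group *group, bool fail_attach,
			 bool *cleanup_sees_null_group)
{
	struct region *region = calloc(1, sizeof(*region));
	int err;

	if (!region)
		return -ENOMEM;

	err = region_attach(region, group, fail_attach);
	if (err)
		goto err_attach;

	*cleanup_sees_null_group = false;
	free(region);	/* the model does not keep the region around */
	return 0;

err_attach:
	/* The unwind path runs the cleanup on a region that was never attached. */
	*cleanup_sees_null_group = destroy_would_deref_null(region);
	free(region);
	return err;
}
```

Calling region_create() with fail_attach set to true returns the attach error and sets the flag, which is the state in which the pre-fix destroy function would have faulted.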
The Fix
The solution? Instead of assuming the tcam pointer is always available via region->group->tcam, the fix retrieves the correct pointer via the helper function mlxsw_sp_acl_to_tcam(), which is safe even in error paths.
Patched code
// Simplified from the upstream fix
struct mlxsw_sp_acl_tcam *tcam = mlxsw_sp_acl_to_tcam(mlxsw_sp->acl);
// ...the rest of the cleanup uses tcam and never touches region->group
With this change, the cleanup no longer dereferences region->group at all, so it is safe even when attaching the region to its group failed. Kernel crash: *averted*.
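The idea behind the fix can be sketched in user space as follows. This is an illustration under assumptions, not the driver's code: struct dev stands in for the driver's device context, acl_to_tcam() stands in for mlxsw_sp_acl_to_tcam(), and the regions_in_use counter is a made-up placeholder for whatever the real cleanup releases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; field names are illustrative only. */
struct tcam { int regions_in_use; };
struct group { struct tcam *tcam; };
struct acl { struct tcam tcam; };	/* the ACL block owns the tcam */
struct dev { struct acl *acl; };	/* stand-in for the device context */
struct region { struct group *group; };	/* may still be NULL in error paths */

/* Stand-in for mlxsw_sp_acl_to_tcam(): derive the tcam from the ACL block. */
static struct tcam *acl_to_tcam(struct acl *acl)
{
	return &acl->tcam;
}

/* Post-fix shape: the tcam comes from the device, never from region->group. */
static void region_destroy(struct dev *dev, struct region *region)
{
	struct tcam *tcam = acl_to_tcam(dev->acl);

	assert(region != NULL);
	tcam->regions_in_use--;	/* placeholder for releasing the region's id */
}
```

Because the tcam is looked up from an object that is always valid while the device exists, region_destroy() behaves the same whether or not the region was ever attached to a group.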
Can This Be Exploited?
This isn’t a classic remote code execution bug, but it can be abused to crash the kernel (a denial of service) under certain conditions. The call trace above runs through the flower classifier offload path, so a privileged user, or even a buggy privileged tool, that installs tc filters over rtnetlink (the netlink interface the tc utility uses to configure traffic control) could steer the kernel into the failing error path:
# In some environments, adding tc flower filters on an affected port could trigger the faulty cleanup.
tc filter add dev eth0 ... # (example; specifics depend on setup)
If you’re running services that let untrusted users control advanced network configuration, this bug is especially dangerous. At best, you’ll crash the kernel and take out the network; at worst, attackers repeatedly trigger reboots.
The official patch is the following upstream commit:
- Git commit: mlxsw: spectrum_acl_tcam: Fix NULL pointer dereference in error path
The fix has been backported to stable kernel series, and distributions ship it through their regular kernel updates, so updating your kernel should keep you safe.
References
- Upstream commit
- Official kernel CVE page
- Linux Kernel Bugzilla Thread
Takeaways
- Check that your kernel includes the fix for CVE-2024-26595, especially if you use Mellanox Spectrum series hardware.
- Even tiny pointer errors can snowball into critical bugs in the kernel.
Stay patched, and keep an eye on those error paths! If you want to geek out over the bug or see if your kernel is up to date, visit one of the above references or run:
uname -a
# and check your distro's patch notes for CVE-2024-26595
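If you prefer to script the version check, the sketch below parses a kernel release string into numbers you can compare. The helper names (kver_parse, kver_cmp, kver_running) are hypothetical, and the minimum version to compare against is deliberately left to the caller: take it from your distribution's advisory, because distribution kernels often carry backported fixes under an older version number, which makes a plain version comparison only a rough hint.

```c
#include <stdio.h>
#include <sys/utsname.h>

struct kver { int major, minor, patch; };

/*
 * Parse the leading "major.minor.patch" of a release string such as
 * "6.6.15-generic". Returns 0 on success, -1 if the prefix is malformed.
 */
int kver_parse(const char *release, struct kver *out)
{
	if (sscanf(release, "%d.%d.%d",
		   &out->major, &out->minor, &out->patch) != 3)
		return -1;
	return 0;
}

/* Returns <0, 0 or >0 when a is older than, equal to or newer than b. */
int kver_cmp(const struct kver *a, const struct kver *b)
{
	if (a->major != b->major)
		return a->major - b->major;
	if (a->minor != b->minor)
		return a->minor - b->minor;
	return a->patch - b->patch;
}

/* Fill `out` with the running kernel's version, as reported by uname(2). */
int kver_running(struct kver *out)
{
	struct utsname u;

	if (uname(&u) != 0)
		return -1;
	return kver_parse(u.release, out);
}
```

A caller would fill one struct kver with kver_running(), another with the fixed version named in the advisory, and treat a negative kver_cmp() result as a prompt to check the patch notes more closely.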
> *This post is written exclusively for deeper understanding and everyday clarity. For any kernel bug, always consult official documentation before making network changes.*
Timeline
Published on: 02/23/2024 15:15:09 UTC
Last modified on: 04/17/2024 19:55:31 UTC