CVE-2024-46857 - Linux Kernel Crash in net/mlx5 Bridge Mode with No VFs
Published: September 2024
Exclusive Long Read by [Assistant]
A newly identified bug in the Linux kernel networking subsystem (net/mlx5) impacted users working with Mellanox/ConnectX network cards. The vulnerability, now resolved upstream, could be triggered when users tried enabling bridge mode on devices without any configured Virtual Functions (VFs). This would crash the kernel due to a NULL pointer dereference.
This post explains the vulnerability in simple terms, shows how to reproduce it, demonstrates the risk, and describes the resolution.
## What is net/mlx5 and Bridge Mode?
- net/mlx5 is the kernel networking driver for Mellanox/ConnectX family NICs.
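- Bridge mode (the hwmode setting, either VEB or VEPA) controls how the card's embedded switch forwards packets between functions on the same port: VEB (Virtual Ethernet Bridge) switches traffic between them on the NIC itself, while VEPA (Virtual Ethernet Port Aggregator) sends it to the external switch first.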
- Virtual Functions (VFs) are hardware-enabled “virtual NICs” for PCIe SR-IOV, often used for high-performance virtualization.
When you configure a card for virtualization, you typically allocate VFs, and bridge mode can change how those VFs talk to each other or the outside network.
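The number of VFs currently enabled on a PF can be read from the standard PCI SR-IOV sysfs attribute `sriov_numvfs`. The helper below is a minimal sketch of that lookup; the function name `vf_count` and the `SYSFS_NET` override are hypothetical and exist only to make the example easy to dry-run.

```shell
# Print the number of enabled SR-IOV VFs for a network interface.
# SYSFS_NET is a hypothetical override for testing; the default is the real sysfs path.
vf_count() {
    local dev="$1"
    local attr="${SYSFS_NET:-/sys/class/net}/$dev/device/sriov_numvfs"
    if [ -r "$attr" ]; then
        cat "$attr"
    else
        # No readable attribute: the device is missing or not SR-IOV capable.
        echo "unknown"
        return 1
    fi
}
```

For example, `vf_count eth2` printing `0` means the PF is in the state this bug requires.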
When no VFs are configured (numvfs=0), if an admin runs
bridge link set dev eth2 hwmode vepa
… the kernel panics with a NULL pointer dereference, for example
[ 168.967392] BUG: kernel NULL pointer dereference, address: 0000000000000030
[ 168.969989] RIP: 0010:mlx5_add_flow_rules+0x1f/0x300 [mlx5_core]
...
[ 168.978620] _mlx5_eswitch_set_vepa_locked+0x113/0x230 [mlx5_core]
[ 168.979074] mlx5_eswitch_set_vepa+0x7f/0xa0 [mlx5_core]
[ 168.979471] rtnl_bridge_setlink+0xe9/0x1f0
...
The key culprit:
esw->fdb_table.legacy.vepa_fdb is NULL when there are no VFs, but the code doesn’t check this before dereferencing.
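To triage a host that went down unexpectedly, one option is to look for this signature in the kernel log. The sketch below checks text on stdin for two marker strings taken from the trace quoted above; the function name `has_vepa_oops` is hypothetical, and matching on these two strings alone is a heuristic, not proof of this exact bug.

```shell
# Return 0 if the kernel log text on stdin contains both markers of this crash.
# The marker strings come from the trace quoted in this post.
has_vepa_oops() {
    local log
    log="$(cat)"
    printf '%s\n' "$log" | grep -q 'kernel NULL pointer dereference' &&
        printf '%s\n' "$log" | grep -q '_mlx5_eswitch_set_vepa_locked'
}
```

Typical usage would be `dmesg | has_vepa_oops && echo "signature found"`. Because the host has usually rebooted by then, a persisted log from the previous boot (for example `journalctl -k -b -1`, if journald persistence is enabled) may be the more useful input.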
## Exploit Details: Who Could Trigger This?
- Any admin with privileges to manage networking interfaces on affected kernels (i.e., root or via sudo).
## Short PoC
# On an unpatched kernel where eth2 is an mlx5 PF with no VFs configured (numvfs=0)...
sudo bridge link set dev eth2 hwmode vepa
# => Kernel may crash!
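On kernels that may not be patched yet, a pre-flight check can keep scripts from hitting this path by accident. The wrapper below is a sketch under stated assumptions: `safe_set_hwmode`, `SYSFS_NET`, and `BRIDGE_CMD` are hypothetical names introduced for illustration, and the VF count is read from the standard `sriov_numvfs` sysfs attribute.

```shell
# Refuse to change the hardware bridge mode on a PF that has no VFs enabled.
# SYSFS_NET and BRIDGE_CMD are hypothetical overrides so the sketch can be dry-run.
safe_set_hwmode() {
    local dev="$1" mode="$2"
    local attr="${SYSFS_NET:-/sys/class/net}/$dev/device/sriov_numvfs"
    local numvfs
    numvfs="$(cat "$attr" 2>/dev/null)" || numvfs=""
    case "$numvfs" in
        ''|*[!0-9]*)
            echo "cannot read VF count for $dev; refusing" >&2
            return 2
            ;;
    esac
    if [ "$numvfs" -eq 0 ]; then
        echo "$dev has no VFs; skipping hwmode change" >&2
        return 1
    fi
    # Left unquoted on purpose so BRIDGE_CMD can be a dry-run command such as "echo".
    ${BRIDGE_CMD:-bridge} link set dev "$dev" hwmode "$mode"
}
```

Running `BRIDGE_CMD=echo safe_set_hwmode eth2 vepa` prints the command that would be executed instead of running it, which is a low-risk way to try the wrapper first.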
## Why Did This Happen?
Bridge mode operations only make sense when there are VFs, i.e., multiple network entities on the same port.
The kernel code, however, would attempt to set up flow rules for VEPA even when there were no VFs. In that state the internal table for those rules (esw->fdb_table.legacy.vepa_fdb) had never been created and was NULL, and no guard like the following ran before the pointer was used:
/* Illustrative guard, not the upstream code: this check was missing, so the NULL table was dereferenced */
if (!esw->fdb_table.legacy.vepa_fdb)
        return -EOPNOTSUPP;
## The Fix
The maintainers added logic to disallow setting (or querying) bridge mode when numvfs == 0. If you try to set it now, the kernel simply returns an error.
Upstream Patch:
net/mlx5: Fix bridge mode operations when there are no VFs
Patch summary (simplified pseudocode, not the literal upstream diff):
if (mlx5_eswitch_num_vfs(esw) == 0)
        return -EOPNOTSUPP;
- No crash on bridge link set ... when there are zero VFs.
- The interface no longer appears in bridge link output in such a case (since there's nothing to configure).
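One way to observe the second point is to check whether the PF still shows up in `bridge link show`. The helper below is a sketch that assumes each entry in that output starts with `<ifindex>: <ifname>`, optionally followed by `@parent`; the function name `listed_in_bridge_link` is hypothetical.

```shell
# Return 0 if interface $1 is listed in `bridge link show` style text on stdin.
# Assumes each entry begins with "<ifindex>: <ifname>" (optionally "<ifname>@<parent>").
listed_in_bridge_link() {
    awk -v dev="$1" '
        $1 ~ /^[0-9]+:$/ {
            name = $2
            sub(/[:@].*$/, "", name)   # strip the trailing ":" or "@parent:"
            if (name == dev) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage would look like `bridge link show | listed_in_bridge_link eth2 || echo "eth2 not listed"`. On a patched kernel with zero VFs, the "not listed" branch is the expected result according to the behavior described above.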
Vulnerable:
Linux kernels that ship the mlx5 driver and do not yet include the fix described above.
Fixed:
Mainline Linux after this commit (6cb3ea8), and all stable series including it.
Severity:
Local DoS — crash if root tries to set bridge mode on unsupported hardware/driver state.
CVE ID:
CVE-2024-46857
Not exploitable for privilege gain, but could be used to intentionally crash critical networking hosts.
## References
- Git Patch: net/mlx5: Fix bridge mode operations when there are no VFs
- CVE-2024-46857 – NVD Listing
- Linux Bridge Command Manual
- Linux kernel networking documentation
- Don’t attempt bridge mode changes on PFs without any VFs.
- Audit your automation/scripts for usage of bridge link set ... hwmode to catch potential misconfiguration.
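The audit step can start with a plain recursive search. This is a rough sketch: the function name `find_hwmode_calls` is hypothetical, and the pattern only catches literal invocations on a single line, so commands assembled from variables or split across lines will be missed.

```shell
# List file:line matches of "bridge link set ... hwmode" under a directory.
# Defaults to the current directory; returns non-zero when nothing matches.
find_hwmode_calls() {
    grep -rnE 'bridge[[:space:]]+link[[:space:]]+set[[:space:]].*hwmode' -- "${1:-.}"
}
```

For example, `find_hwmode_calls /etc/network` would print each matching file and line number so the surrounding logic can be reviewed by hand.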
Stay patched!
For more details, track distributions’ security/errata channels for updates specific to CVE-2024-46857.
*Written exclusively for you. If you rely on high-performance Linux networking, keep those kernels updated!*
## Timeline
Published on: 09/27/2024 13:15:17 UTC
Last modified on: 10/01/2024 17:10:29 UTC