A subtle but critical bug was recently fixed in the Linux kernel’s mlx5 network driver code, specifically affecting Link Aggregation Group (LAG) port selection structures. Identified as CVE-2025-21675, this vulnerability could lead to a kernel crash (NULL pointer dereference) on systems using Mellanox (NVIDIA) MLX5 series NICs for advanced networking.


> TL;DR: Anyone who can trigger errors or race conditions during MLX5 LAG initialization can crash the kernel, resulting in a denial of service (DoS).

Background: What's Going On?

Many modern data centers use Mellanox NICs for high-speed networking. For resilience and throughput, multiple physical connections are often bundled using LAG (Link Aggregation Group).

In the MLX5 Linux driver, when the kernel needs to (re)create LAG port selector structures, it does so via functions like mlx5_lag_port_sel_create(). If a failure occurs partway through this setup, the driver tries to clean up (destroy) all previously set up "definers"—structures mapping traffic types (tt) to hardware rules.

The problem arises when that cleanup leaves a "stale" pointer behind: a definer has been destroyed, but the structure still records it as allocated. The next cleanup pass then tries to destroy it again (a double destroy), which ends in a kernel NULL pointer dereference and an immediate oops or panic.

Here’s a simplified version of the actual failure flow:

mlx5_lag_port_sel_create()
   → mlx5_lag_create_definers()        // allocate definers for traffic types
      → mlx5_lag_create_definer()      // create one definer; fails partway, e.g. tt=1
      → mlx5_lag_destroy_definers()    // destroy all previously created definers

// If the struct fields (definers) are not cleared after the destroy,
// a later failing attempt runs mlx5_lag_destroy_definers() again
// and destroys the same definers a second time, leading to the crash.

Example Error

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
...
[Relevant stacktrace, showing double destruction in mlx5_core]
...
Code: a9025bf5 aa0003f6 a90363f7 f90023f9 (f9400400)

See the bug tracker's full dump for more details.

Faulty Code Conceptually

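// Driver code (pseudo C, heavily simplified)
int mlx5_lag_create_definers(/* ... */ struct mlx5_lag *lag) {
    for (int tt = 0; tt < N; ++tt) {
        lag->definers[tt] = alloc_definer();
        if (!lag->definers[tt]) {
            mlx5_lag_destroy_definers(lag); // Destroy all definers created so far
            return -ENOMEM;
        }
    }
    return 0; // all definers created
}

// However, after destroying, the earlier lag->definers[tt] entries weren't cleared to NULL,
// so the next cleanup pass sees them as still allocated.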
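To make the stale-pointer problem concrete, here is a small userspace model in C. It is only a sketch and not the driver's code: `struct definer`, `struct lag_model`, `NUM_TT`, the `fail_at` parameter, and all helper names are hypothetical. "Destroying" a definer only bumps a counter instead of freeing memory, so the double destroy can be observed safely.

```c
#include <stddef.h>

#define NUM_TT 3  /* hypothetical number of traffic types */

/* Stand-in for a LAG definer; destroy_count records how often it was torn down. */
struct definer {
    int destroy_count;
};

struct lag_model {
    struct definer *definers[NUM_TT];
};

/* Static storage, so nothing is ever really freed in this model. */
static struct definer pool[NUM_TT];

/* Hypothetical allocator: fails for slot fail_at (pass -1 to never fail). */
static struct definer *alloc_definer(int tt, int fail_at)
{
    return tt == fail_at ? NULL : &pool[tt];
}

/* Buggy teardown: walks every slot but leaves the pointers in place. */
static void destroy_definers_buggy(struct lag_model *lag)
{
    for (int tt = 0; tt < NUM_TT; ++tt)
        if (lag->definers[tt])
            lag->definers[tt]->destroy_count++;
}

/* Records a definer only on success; tears everything down on failure. */
static int create_definers(struct lag_model *lag, int fail_at)
{
    for (int tt = 0; tt < NUM_TT; ++tt) {
        struct definer *d = alloc_definer(tt, fail_at);

        if (!d) {
            destroy_definers_buggy(lag);
            return -1;
        }
        lag->definers[tt] = d;
    }
    return 0;
}

/* Runs two failing attempts and returns how often slot 0 was destroyed. */
int double_destroy_demo(void)
{
    struct lag_model lag = { { NULL } };

    pool[0].destroy_count = 0;
    create_definers(&lag, 1);  /* fails on tt=1: slot 0 destroyed, pointer stays */
    create_definers(&lag, 0);  /* fails on tt=0: stale slot 0 destroyed again */
    return pool[0].destroy_count;
}
```

Calling `double_destroy_demo()` reports that slot 0 was destroyed twice. In the real driver, the second destroy operates on an object that no longer exists, which is where the crash comes from.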

The Fix

The patch ensures that after destruction, pointers are nulled so double-destroy can't happen.

void mlx5_lag_destroy_definers(struct mlx5_lag *lag) {
    for (int tt = 0; tt < N; ++tt) {
        if (lag->definers[tt]) {
            destroy_definer(lag->definers[tt]);
            lag->definers[tt] = NULL; // <-- THE FIX: no stale pointer is left behind
        }
    }
}
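The effect of the fix can be checked in the same kind of userspace model. Again this is a sketch under assumptions: `struct port_sel_model`, `port_sel_create()`, and the `fail_at` parameter are hypothetical names, and destroying only bumps a counter. The model combines the per-pointer reset shown above with zeroing the whole structure on the error path, which is one plausible reading of "clear the port select structure"; consult the actual patch for the exact form.

```c
#include <stddef.h>
#include <string.h>

#define NUM_TT 3  /* hypothetical number of traffic types */

struct definer {
    int destroy_count;  /* how many times this definer was torn down */
};

/* Hypothetical stand-in for the port selection state. */
struct port_sel_model {
    struct definer *definers[NUM_TT];
    int tunnel;
};

/* Static storage instead of real allocation. */
static struct definer pool[NUM_TT];

/* Teardown that resets each slot, so calling it twice is harmless. */
static void destroy_definers(struct port_sel_model *ps)
{
    for (int tt = 0; tt < NUM_TT; ++tt) {
        if (ps->definers[tt]) {
            ps->definers[tt]->destroy_count++;
            ps->definers[tt] = NULL;
        }
    }
}

/* Creation whose error path leaves the structure fully zeroed. */
static int port_sel_create(struct port_sel_model *ps, int fail_at)
{
    ps->tunnel = 1;
    for (int tt = 0; tt < NUM_TT; ++tt) {
        if (tt == fail_at)
            goto err;
        ps->definers[tt] = &pool[tt];
    }
    return 0;

err:
    destroy_definers(ps);
    memset(ps, 0, sizeof(*ps));  /* no stale values survive a failed create */
    return -1;
}

/* Two failing attempts in a row; returns the destroy count of slot 0. */
int fixed_destroy_demo(void)
{
    struct port_sel_model ps;

    memset(&ps, 0, sizeof(ps));
    pool[0].destroy_count = 0;
    port_sel_create(&ps, 1);  /* fails on tt=1: slot 0 destroyed once */
    port_sel_create(&ps, 0);  /* fails on tt=0: nothing left to destroy */
    return pool[0].destroy_count;
}
```

With the reset in place, the second failing attempt finds only NULL slots, so each definer is destroyed exactly once.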

See the patch

Threat Model

- Attacker: Needs to be able to trigger device reconfiguration. In cloud or data center environments, this may include tenants with sufficient privileges.

- Impact: Denial of Service (DoS) through a kernel panic; a system reboot is required.

- Scope: Only affects systems with MLX5 hardware (NVIDIA/Mellanox) and kernel versions before the patch.

Example Scenario

A privileged (root or near-root) user, or a buggy system service, triggers rapid up/down or LAG reconfigurations while causing allocation errors. For instance, by deliberately filling up the driver's memory pools or injecting fault conditions (e.g., via fault injection frameworks), they provoke the error path.
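Fault injection is also how this class of bug is usually found in testing: fail the Nth allocation, then check that the error path left nothing behind. The kernel ships its own fault-injection infrastructure for this; the following is only a userspace analogy in C, and every name in it (`struct fault_injector`, `faulty_alloc`, `setup_objects`, `sweep_error_paths`, `NUM_OBJS`) is hypothetical.

```c
#include <stdlib.h>

#define NUM_OBJS 4  /* hypothetical number of objects set up in one pass */

/* Fails the fail_nth-th call to faulty_alloc(); 0 disables injection. */
struct fault_injector {
    unsigned long calls;
    unsigned long fail_nth;
};

static void *faulty_alloc(struct fault_injector *fi, size_t size)
{
    fi->calls++;
    if (fi->fail_nth && fi->calls == fi->fail_nth)
        return NULL;  /* injected failure */
    return malloc(size);
}

/* Allocates count objects; on failure frees and clears the earlier ones. */
static int setup_objects(struct fault_injector *fi, void **objs, int count)
{
    for (int i = 0; i < count; ++i) {
        objs[i] = faulty_alloc(fi, 64);
        if (!objs[i]) {
            for (int j = 0; j < i; ++j) {
                free(objs[j]);
                objs[j] = NULL;  /* clear the slot so no stale pointer remains */
            }
            return -1;
        }
    }
    return 0;
}

/* Fails each allocation in turn; counts error paths that left no stale slot. */
int sweep_error_paths(void)
{
    int clean = 0;

    for (unsigned long n = 1; n <= NUM_OBJS; ++n) {
        struct fault_injector fi = { 0, n };
        void *objs[NUM_OBJS] = { NULL };
        int stale = 0;

        if (setup_objects(&fi, objs, NUM_OBJS) == 0) {
            /* Not expected while injecting; release and move on. */
            for (int i = 0; i < NUM_OBJS; ++i)
                free(objs[i]);
            continue;
        }
        for (int i = 0; i < NUM_OBJS; ++i)
            if (objs[i])
                stale++;
        if (stale == 0)
            clean++;
    }
    return clean;
}
```

Removing the `objs[j] = NULL;` line makes the sweep report fewer clean error paths, which is the userspace equivalent of the stale definer pointers described in this post.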

Sketch of an Exploit

# As root: repeatedly reconfigure LAG (bond) interfaces to stress the driver
while true; do
    ip link del bond0 2>/dev/null
    ip link add bond0 type bond mode 802.3ad
    ip link set eth0 master bond0
    ip link set eth1 master bond0
    # optional: stress memory or limit kernel allocs
done

If this triggers a mid-setup failure, the buggy cleanup logic will try to destroy definers that were already destroyed, causing a kernel crash.

> Note: This is a logic bug. It does not lead to privilege escalation, and an unprivileged local user usually cannot trigger it; root-equivalent permissions are needed.

- The patch changes the error cleanup logic: after the definers are destroyed, their pointers are cleared.

- The kernel now resets the port selection structures when an error occurs during setup, so no stale pointers remain to be destroyed twice.

Fix present in

- Mainline Linux kernel after this patch
- Upcoming stable releases (see stable mailing list)

Am I affected?

- Are you running Mellanox/NVIDIA MLX5 hardware?

- Upgrade your kernel to the latest version with the patch.

- Limit unprivileged access to networking/driver configuration (standard best practice in production).
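As a quick first check for the hardware question, the following C sketch looks for the `mlx5_core` module in `/proc/modules`, where each line starts with a module name. The function names `module_listed` and `mlx5_core_loaded` are made up for this example. A hit only means the driver is loaded, not that the kernel is unpatched, and a driver built into the kernel does not appear in `/proc/modules` at all.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if name is the first field of any line read from fp, else 0. */
int module_listed(FILE *fp, const char *name)
{
    char line[512];
    size_t len = strlen(name);
    int at_line_start = 1;

    while (fgets(line, sizeof(line), fp)) {
        /* Match whole names only, so "mlx5" does not match "mlx5_core". */
        if (at_line_start && strncmp(line, name, len) == 0 &&
            (line[len] == ' ' || line[len] == '\n' || line[len] == '\0'))
            return 1;
        /* A chunk without a newline means the same line continues. */
        at_line_start = strchr(line, '\n') != NULL;
    }
    return 0;
}

/* Returns 1 if loaded, 0 if not listed, -1 if /proc/modules is unreadable. */
int mlx5_core_loaded(void)
{
    FILE *fp = fopen("/proc/modules", "r");
    int found;

    if (!fp)
        return -1;
    found = module_listed(fp, "mlx5_core");
    fclose(fp);
    return found;
}
```

If `mlx5_core_loaded()` returns 1, compare the running kernel version against your distribution's advisory for CVE-2025-21675 to see whether the fix is included.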

Further References

- Patch to Linux kernel
- Original bug report / trace
- Kernel LAG/MLX5 Documentation
- CVE record (NVD) *(pending publish)*

Conclusion

CVE-2025-21675 is a classic example of how resource cleanup mistakes can have severe impacts, even in mature code. Fortunately, the bug “only” allows crashes (Denial of Service), not code execution or privilege escalation, and only affects systems using specific NICs.

The fix is simple: clean up pointers after freeing. To stay safe, always run current kernels, and restrict device reconfiguration powers to trusted admin users.

> If you run a data center with Mellanox gear, patch ASAP — this is a stability fix, not just a security fix!


*Feel free to share or cite. For feedback or corrections, contact [your kernel vendor](mailto:security@kernel.org) or your OS support channel.*

Timeline

Published on: 01/31/2025 12:15:28 UTC
Last modified on: 02/04/2025 15:30:22 UTC