CVE-2021-46931 - Linux Kernel net/mlx5e TX Timeout Bug — Deep Dive & Exploit Details

Linux is widely used for networking because of its performance and hardware offload support, and the Mellanox (now NVIDIA) mlx5 driver is among the most popular for high-speed Ethernet and InfiniBand adapters. A vulnerability in its Ethernet component, mlx5e, was found and patched, and is tracked as CVE-2021-46931. The bug caused a kernel panic during transmit (TX) timeout recovery because a void * context pointer was interpreted as the wrong structure type. Let's break down the vulnerability, how it could be exploited, the code involved, and the fix, in simple language.

Technical Context

When the Linux kernel detects that packet transmission (“TX”) has stalled on an mlx5 interface, the driver schedules recovery as deferred work (mlx5e_tx_timeout_work). For debugging and recovery, the driver uses “devlink health reporters”, which can dump internal queue state.

The function mlx5e_tx_reporter_dump_sq() is supposed to take a pointer to struct mlx5e_txqsq (the send queue), but in the TX timeout recovery path, it's passed a pointer to a different structure: struct mlx5e_tx_timeout_ctx.

As a result, when the dump function read the fields it expected to find in a struct mlx5e_txqsq, it was actually reading the memory of a struct mlx5e_tx_timeout_ctx, whose layout is unrelated. This led to a kernel stack overflow and panic, crashing the whole machine.

Example Kernel Log

mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
...
Kernel panic - not syncing: Fatal exception
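
If you want to watch for this signature in collected logs, a small parser is enough. The following is a minimal sketch, not part of the driver: the function name parse_tx_timeout_line is hypothetical, and it assumes the message wording shown in the excerpt above, which may differ between kernel versions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 and fills *queue and *usecs when `line`
 * is a "TX timeout on queue" message, 0 for any other line. */
static int parse_tx_timeout_line(const char *line, int *queue,
                                 unsigned long *usecs)
{
    static const char queue_key[] = "TX timeout on queue: ";
    static const char usecs_key[] = "usecs since last trans: ";
    const char *q, *u;

    assert(line != NULL && queue != NULL && usecs != NULL);

    /* Locate both markers; a line missing either one is not a match. */
    q = strstr(line, queue_key);
    u = strstr(line, usecs_key);
    if (q == NULL || u == NULL)
        return 0;

    /* Read the number that follows each marker. */
    if (sscanf(q + strlen(queue_key), "%d", queue) != 1)
        return 0;
    if (sscanf(u + strlen(usecs_key), "%lu", usecs) != 1)
        return 0;
    return 1;
}
```

Fed the second line of the excerpt, the helper reports the queue number and the microseconds since the last transmission; the "TX timeout detected" line alone does not match, because it carries neither value.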

In the buggy kernel source, the key function looked like this (simplified):

// This function is a 'dump' callback for the devlink reporter
static int mlx5e_tx_reporter_dump_sq(struct devlink_fmsg *fmsg, void *ctx)
{
    struct mlx5e_txqsq *sq = ctx; // Casts void* to expected type
    // ... read fields from sq ...
}

But in the TX timeout recovery path, this callback was called with a pointer to a different structure:

struct mlx5e_tx_timeout_ctx {
    struct mlx5e_txqsq *sq;
    // ...other fields...
};

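The context pointer really pointed at a struct mlx5e_tx_timeout_ctx, so treating it as a struct mlx5e_txqsq * made the dump function read garbage values and crash the kernel.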
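
To make the mismatch concrete, here is a minimal user-space sketch. The demo_* types are hypothetical, heavily reduced stand-ins for the driver structures and do not reflect the real layouts. The helper copies bytes with memcpy instead of dereferencing a miscast pointer, so the demonstration itself stays well-defined C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct mlx5e_txqsq (the real layout differs). */
struct demo_txqsq {
    uint32_t sqn; /* send queue number; first field in this demo only */
};

/* Hypothetical stand-in for struct mlx5e_tx_timeout_ctx. */
struct demo_tx_timeout_ctx {
    struct demo_txqsq *sq; /* a pointer to the queue, not the queue itself */
    int status;
};

/* Mimics a callee that assumes ctx points at a demo_txqsq and reads sqn.
 * It returns whatever bytes sit at the start of the object ctx points to. */
static uint32_t demo_read_sqn(const void *ctx)
{
    uint32_t sqn;

    assert(ctx != NULL);
    memcpy(&sqn, ctx, sizeof(sqn));
    return sqn;
}
```

With a queue whose sqn is 0x11ec and a timeout context whose sq member points at it, demo_read_sqn(&sq) returns 0x11ec, while demo_read_sqn(&tctx) returns part of the stored pointer value. In the kernel the same confusion applies to every field the dump function reads.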

Exploitability

Triggering the bug requires a TX timeout on an mlx5e interface, either a genuine one caused by hardware trouble or one induced deliberately (for example, by flooding TX queues or simulating a failure). A local attacker with the right permissions could therefore crash a system running a vulnerable kernel, leading to a denial of service. The sequence is:

1. The attacker forces a TX timeout (e.g., via crafted ioctls or massive packet floods).

2. When the recovery work runs, the dump callback receives the wrong pointer type, causing a stack overflow and kernel panic.

Note: No privilege escalation or code execution here—just a system crash (DoS).

The Fix

To fix this, developers added a wrapper function that extracts the correct queue pointer from the mlx5e_tx_timeout_ctx structure before calling the original dump function.

Patched Code

// Wrapper added (simplified here; upstream names it mlx5e_tx_reporter_timeout_dump)
static int mlx5e_tx_reporter_dump_sq_wrap(struct devlink_fmsg *fmsg, void *ctx)
{
    struct mlx5e_tx_timeout_ctx *timeout_ctx = ctx;
    struct mlx5e_txqsq *sq = timeout_ctx->sq;

    return mlx5e_tx_reporter_dump_sq(fmsg, sq);
}

// When registering dump callback for timeout, use the wrapper
reporter->dump = mlx5e_tx_reporter_dump_sq_wrap;

Now, the dump function always receives the correct pointer type, preventing invalid memory access, stack overflows, and panics.
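
The same unwrap-then-delegate pattern can be tried outside the kernel. The sketch below is a user-space model under stated assumptions: every sketch_* name is hypothetical, the structures are reduced to a single field, and sketch_report only stands in for reporter code that sees nothing but a callback and a void * context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced stand-ins for the two driver structures. */
struct sketch_txqsq {
    uint32_t sqn;
};

struct sketch_tx_timeout_ctx {
    struct sketch_txqsq *sq;
    int status;
};

/* Every dump callback receives an untyped context pointer. */
typedef int (*sketch_dump_fn)(uint32_t *out_sqn, void *ctx);

/* Original dump: requires ctx to be a struct sketch_txqsq *. */
static int sketch_dump_sq(uint32_t *out_sqn, void *ctx)
{
    struct sketch_txqsq *sq = ctx;

    assert(sq != NULL && out_sqn != NULL);
    *out_sqn = sq->sqn;
    return 0;
}

/* Wrapper for the timeout path: requires a struct sketch_tx_timeout_ctx *,
 * extracts the queue pointer, then delegates to the original dump. */
static int sketch_dump_sq_wrap(uint32_t *out_sqn, void *ctx)
{
    struct sketch_tx_timeout_ctx *timeout_ctx = ctx;

    assert(timeout_ctx != NULL);
    return sketch_dump_sq(out_sqn, timeout_ctx->sq);
}

/* Stand-in for the reporter: it cannot check what ctx really points to. */
static int sketch_report(sketch_dump_fn dump, void *ctx, uint32_t *out_sqn)
{
    return dump(out_sqn, ctx);
}
```

Calling sketch_report(sketch_dump_sq_wrap, &tctx, &sqn) yields the queue's real sqn, and sketch_report(sketch_dump_sq, &sq, &sqn) still works for paths that already hold the queue pointer. Because the compiler cannot type-check a void * context, each call path has to pair its context with a callback that expects exactly that type.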

Full Example: Simplified PoC

While a real-world exploit would require kernel manipulation or repeated link failures, here’s how the call sequence looked, simplified:

// Broken flow (vulnerable)
mlx5e_tx_reporter_dump_sq(fmsg, (void *)&tx_timeout_ctx); // wrong pointer type

// Fixed flow
mlx5e_tx_reporter_dump_sq_wrap(fmsg, (void *)&tx_timeout_ctx);
//    ^ extracts .sq pointer and passes it correctly.

References and Further Reading

- Linux Kernel Mailing List Patch
- CVE-2021-46931 at MITRE
- Official commit in kernel git
- mlx5e Driver code (LXR search)

Summary

- CVE-2021-46931 is a crash bug in the Mellanox mlx5 Linux driver, triggered by a wrong pointer cast in TX timeout error recovery.

- Fixed by properly extracting and passing the expected queue pointer.

- All users of affected kernels should upgrade or apply the backported patch, especially on servers with Mellanox/NVIDIA adapters.

If you're running a data center or cluster on Mellanox hardware, be sure to check your kernel version and apply this fix to avoid crashes!
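
For a quick first check of the kernel version, something like the following sketch can help. The kver_* names are hypothetical, and this post does not list the fixed releases, so the version to compare against has to come from the CVE record or your distribution's advisory. The code uses the POSIX uname() call and assumes the release string starts with three dot-separated numbers.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

struct kver {
    int major, minor, patch;
};

/* Parses the leading "X.Y.Z" of a release string such as "5.10.89-generic".
 * Returns 1 on success, 0 if the string does not start with three numbers. */
static int kver_parse(const char *release, struct kver *out)
{
    assert(release != NULL && out != NULL);
    return sscanf(release, "%d.%d.%d",
                  &out->major, &out->minor, &out->patch) == 3;
}

/* Orders two versions: negative if a < b, zero if equal, positive if a > b. */
static int kver_cmp(const struct kver *a, const struct kver *b)
{
    if (a->major != b->major)
        return a->major - b->major;
    if (a->minor != b->minor)
        return a->minor - b->minor;
    return a->patch - b->patch;
}

/* Fills *out with the running kernel's version; returns 1 on success. */
static int kver_running(struct kver *out)
{
    struct utsname u;

    if (uname(&u) != 0)
        return 0;
    return kver_parse(u.release, out);
}
```

Parse the first fixed release of your stable series with kver_parse(), fetch the running version with kver_running(), and treat a negative kver_cmp() result as a prompt to look closer. Distribution kernels often backport fixes without changing the upstream version number, so this comparison is only a hint and not proof either way.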


*This post is exclusive and written in plain language for clarity. Please check the official links for up-to-date patches and advisories.*

Timeline

Published on: 02/27/2024 10:15:07 UTC
Last modified on: 04/10/2024 16:31:14 UTC