A subtle but significant flaw was found and fixed in the Linux kernel’s mlx5e driver, which serves the Mellanox (NVIDIA) network cards commonly used in data centers and high-performance computing. The vulnerability, CVE-2023-52487, is a race condition in the handling of offloaded peer flow rules. A local attacker who can trigger it may crash the system or provoke kernel warnings, which makes it a stability risk on affected systems.

This article explains the bug in plain language, describes the trigger conditions, illustrates the code problem with a simplified example, and lists trusted references.

- mlx5e: The main Ethernet driver for Mellanox (NVIDIA) network cards of the mlx5 family; it is part of the mlx5_core kernel module.
- Peer flows: These are networking rules shared across multiple instances for hardware offloading (for high performance in things like Open vSwitch).
- The driver's flow offloading makes high-speed SDN and networking possible but means bugs here can instantly affect whole servers.

What Happened?

- When deleting so-called “peer flows,” the kernel code was refactored to only clear a flag (DUP) if the list of these peer flows was empty.
- But if another kernel task (such as the neighbor update workqueue) holds a reference to a peer flow at the same time, the flow’s reference count won’t hit zero, so the flow remains on the list and the flag stays set.
- Immediately after, the deletion function still tries to delete the flow from every possible peer index (not just initialized ones). If it tries to remove from a never-initialized entry, you get a NULL pointer dereference or, if debug list support is enabled, a list corruption warning/panic.
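The sequence above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct model_flow`, `MAX_PEERS`, and every `model_*` function are hypothetical names invented here, and the model tracks state with plain booleans and counters instead of the driver's real lists and locks.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PEERS 4 /* hypothetical bound on peer indexes */

/* Hypothetical, heavily simplified stand-in for the driver's flow state. */
struct model_flow {
    bool dup;                         /* models the DUP flag */
    bool slot_initialized[MAX_PEERS]; /* per-index entry was ever set up */
    bool peer_listed[MAX_PEERS];      /* peer flow is still on the list */
    int  peer_refs[MAX_PEERS];        /* references held on each peer flow */
};

/* Returns true once no peer flow remains on the list. */
bool model_peer_list_empty(const struct model_flow *f)
{
    for (int i = 0; i < MAX_PEERS; i++)
        if (f->peer_listed[i])
            return false;
    return true;
}

/*
 * Models the pre-fix deletion for one index. Returns 1 if the call touched
 * a slot that was never initialized, 0 otherwise.
 */
int model_del_peer(struct model_flow *f, int idx)
{
    assert(idx >= 0 && idx < MAX_PEERS);
    if (!f->dup)
        return 0;                     /* flag clear: nothing left to do */
    if (!f->slot_initialized[idx])
        return 1;                     /* a real kernel would oops or warn here */

    if (--f->peer_refs[idx] == 0)
        f->peer_listed[idx] = false;  /* unlinked only with the last reference */
    if (model_peer_list_empty(f))
        f->dup = false;               /* flag cleared only on an empty list */
    return 0;
}

/* Models the outer loop over every possible peer index. */
int model_del_all_peers(struct model_flow *f)
{
    int bad_accesses = 0;
    for (int i = 0; i < MAX_PEERS; i++)
        bad_accesses += model_del_peer(f, i);
    return bad_accesses;
}

/* Builds a flow peered only at index 0, with `refs` references on the peer. */
struct model_flow model_flow_one_peer(int refs)
{
    struct model_flow f = { .dup = true };
    f.slot_initialized[0] = true;
    f.peer_listed[0] = true;
    f.peer_refs[0] = refs;
    return f;
}
```

In this model, a flow built with `model_flow_one_peer(1)` is deleted cleanly: the only reference is dropped, the list empties, and the flag is cleared before the loop reaches the unused indexes. With `model_flow_one_peer(2)`, which stands for a second reference held by a concurrent task, the flag stays set and the loop reports a bad access for each of the three indexes that were never set up.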

In Short

> Delete a peer flow while another kernel task still holds a reference to it, and you can crash the Linux kernel or trigger warnings.

Here’s an excerpt of the kernel log when the bug triggers with list debugging enabled:

list_del corruption, ffff888139110698->next is NULL
WARNING: CPU: 2 PID: 22109 at lib/list_debug.c:53 __list_del_entry_valid_or_report+0x4f/0xc
...
mlx5e_tc_del_fdb_peers_flow+0xcf/0x240 [mlx5_core]
...
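The "next is NULL" message in the log comes from a sanity check that a debug-list kernel runs before unlinking an entry. The sketch below is a hypothetical userspace analogue of that kind of check, not the kernel's implementation: `struct list_node` and the `list_node_*` functions are names made up for this example, and marking an unlinked entry with NULL pointers is a choice made here for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Minimal doubly linked node, shaped like a circular kernel-style list. */
struct list_node {
    struct list_node *next, *prev;
};

/* Makes `n` an empty list that points at itself. */
void list_node_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

/* Inserts `n` right after `head`. */
void list_node_add(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/*
 * Unlinks `entry` only if its pointers look like part of a well-formed list.
 * Returns false (and reports) instead of dereferencing bad pointers.
 */
bool list_node_del_checked(struct list_node *entry)
{
    assert(entry != NULL);
    if (entry->next == NULL || entry->prev == NULL) {
        fprintf(stderr, "list_del corruption, %p->%s is NULL\n",
                (void *)entry, entry->next == NULL ? "next" : "prev");
        return false;
    }
    if (entry->prev->next != entry || entry->next->prev != entry) {
        fprintf(stderr, "list_del corruption, %p is not linked\n",
                (void *)entry);
        return false;
    }
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = entry->prev = NULL; /* mark as unlinked (choice of this sketch) */
    return true;
}
```

An entry that was zero-filled and never added to any list fails the first test, which matches the shape of the warning above. Without such a check, the unlink would follow the NULL pointer directly, which is the crash case.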

The Vulnerable Code

The bug lies in how mlx5e_tc_del_fdb_peers_flow() and its helper mlx5e_tc_del_fdb_peer_flow() combine reference counting with list deletion.

Simplified Example (NOT actual driver code)

/* Hypothetical simplified types; the real driver structures differ. */
void mlx5e_tc_del_fdb_peers_flow(struct flow *f)
{
    /* Visits every possible peer index, whether or not it was ever set up. */
    for (int i = 0; i < MAX_PEERS; i++)
        mlx5e_tc_del_fdb_peer_flow(f, i);
}

The buggy mlx5e_tc_del_fdb_peer_flow() only removed a peer flow from the list if it released the last reference. If another task still held a reference, the peer flow stayed on the list and the flag stayed set, so subsequent iterations could try to delete from a slot that was never initialized, causing a kernel oops.

The Patch

Patch summary:
Always remove the peer flow from the list, even if not releasing the last reference.

Patch example

- if (last_ref)
-     list_del(&flow->peer_list);
+ list_del(&flow->peer_list); // Always
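The ordering change in the patch summary can be modeled in a few lines of userspace C. This is a sketch under assumptions, not driver code: `struct peer_flow` here is a hypothetical three-field stand-in, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a peer flow: a refcount plus two state bits. */
struct peer_flow {
    int  refcnt;
    bool on_list;    /* still linked into the peer list */
    bool torn_down;  /* resources released */
};

/* Drops one reference; tears the flow down when the last one goes away. */
void peer_flow_put(struct peer_flow *p)
{
    assert(p->refcnt > 0);
    if (--p->refcnt == 0)
        p->torn_down = true;
}

/* Pre-fix shape: the flow leaves the list only with its last reference. */
void delete_peer_before_fix(struct peer_flow *p)
{
    bool last = (p->refcnt == 1);
    peer_flow_put(p);
    if (last)
        p->on_list = false;
}

/* Post-fix shape: unlink unconditionally, then drop the reference. */
void delete_peer_after_fix(struct peer_flow *p)
{
    p->on_list = false;
    peer_flow_put(p);
}
```

With two references outstanding, `delete_peer_before_fix()` leaves the flow on the list, while `delete_peer_after_fix()` unlinks it immediately and leaves teardown to whoever calls `peer_flow_put()` last. The point of the design is that list membership no longer depends on what concurrent holders are doing.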

Upstream Git fix:
- net/mlx5e: Fix peer flow lists handling (patch)

Below is a conceptual PoC for admins or researchers (use in lab only!)

#!/bin/bash
# Pre-condition: Open vSwitch w/ hardware offloading and Mellanox cards

for i in {1..100}; do
    # Add a flow that would get offloaded (replace with your own match/action)
    ovs-ofctl add-flow br "priority=100,in_port=1,actions=output:2"
    # Remove it right away, many times, in parallel with background flows
    ovs-ofctl del-flows br "priority=100,in_port=1"
done &
# Meanwhile, in another shell:
ip neighbor flush dev eth1  # This can kick off neighbor update workqueue

Expected outcome:
With enough concurrency and luck, a system panic may occur (see dmesg for the stack above).

- Risk: Local DoS (system crash) that can bring down production network nodes.
- Fixed in: the upstream kernel patch "net/mlx5e: Fix peer flow lists handling".

- Original patch: net/mlx5e: Fix peer flow lists handling
- CVE entry: CVE-2023-52487 at MITRE
- Bugzilla/Fedora tracking: Red Hat Bugzilla #2263246
- List debug info: Linux Kernel list debugging

Conclusion

CVE-2023-52487 is a real-world example of how complex networking code, especially around hardware offloading and concurrency, can trip up even top-tier kernel developers. If you run data center Linux with Mellanox and OVS, patch now, or your nodes could be vulnerable to local or orchestrated DoS attacks.

Happy and safe kernel hacking! For further reading, check out the kernel changelog discussions.


Timeline

Published on: 03/11/2024 18:15:16 UTC
Last modified on: 12/12/2024 17:31:37 UTC