CVE-2024-26859 - Linux Kernel Race Condition in bnx2x Network Driver – Exploit Details and Patch Analysis

CVE-2024-26859 is a critical race condition vulnerability discovered and resolved in the Linux kernel’s Broadcom NetXtreme II Ethernet (bnx2x) driver. This bug could allow a privileged local user, or certain hardware events, to cause a system crash or potentially open the door for code execution by triggering memory access to already-freed pages during PCIe EEH (Enhanced Error Handling) device resets.

A Simple Explanation

The bnx2x driver runs on servers using Broadcom ethernet cards. When error recovery happens – like a hardware fault or hotplug event – two separate code paths might try to clean up the same memory buffer at the same time. This race can make one operation free a memory page, while the other still thinks it’s valid and tries to use it, leading to a system crash (NULL pointer dereference) or, under certain circumstances, possible exploitation.

During EEH slot reset, the sequence leading to the bug looked like this:

1. Transmit Timeout triggers, and the driver’s timeout handler (bnx2x_tx_timeout()) schedules a reset/unload of the network device.
2. EEH framework (for handling PCI errors) independently triggers its own slot_reset routine, also causing a reset/unload.
3. Both routines, running almost simultaneously, attempt to free Scatter-Gather Entries (SGEs) and associated page buffers, but with insufficient safety checks.
4. If Thread A frees a page and Thread B tries to use (or free) it right after, Thread B will reference a "dangling pointer" (memory that’s already returned to the system).
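The sequence above can be modeled in userspace. What follows is a minimal sketch, not driver code: sge_entry, sge_ring, ring_fill() and unload_pass() are hypothetical names invented for illustration, and the two unload paths run one after the other rather than truly in parallel. It shows what the second path finds once the first has finished: every entry is already released, so any pass that does not check first would be operating on pages that are gone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 4

/* Hypothetical stand-in for one SGE bookkeeping entry; not the driver's type. */
struct sge_entry {
    void *page;
};

struct sge_ring {
    struct sge_entry entries[RING_SIZE];
    int frees;      /* pages actually released */
    int stale_hits; /* entries a pass found already released */
};

/* Allocate a buffer for every entry. Returns 0 on success, -1 on failure. */
int ring_fill(struct sge_ring *ring)
{
    ring->frees = 0;
    ring->stale_hits = 0;
    for (int i = 0; i < RING_SIZE; i++)
        ring->entries[i].page = NULL;

    for (int i = 0; i < RING_SIZE; i++) {
        ring->entries[i].page = malloc(64);
        if (!ring->entries[i].page) {
            /* Roll back the entries allocated so far. */
            while (i-- > 0) {
                free(ring->entries[i].page);
                ring->entries[i].page = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* One unload pass, as either the tx-timeout path or the EEH path might run it. */
void unload_pass(struct sge_ring *ring)
{
    for (int i = 0; i < RING_SIZE; i++) {
        struct sge_entry *entry = &ring->entries[i];

        if (!entry->page) {
            /* Without this check, the pass would touch a page that is gone. */
            ring->stale_hits++;
            continue;
        }
        free(entry->page);
        entry->page = NULL; /* mark the entry as released */
        ring->frees++;
    }
    assert(ring->frees <= RING_SIZE);
}
```

Calling unload_pass() twice on a filled ring releases each page exactly once; the second call only increments stale_hits, which is the situation the second reset path ends up in.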

This triggers a kernel crash (NULL pointer dereference), as shown by this error chain:

Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0080000025065fc
Call Trace:
[c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x]
[c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0
...

Problem Code Snippet: Where Things Go Wrong

static inline void bnx2x_free_rx_sge(struct bnx2x *bp,
                                     struct bnx2x_fastpath *fp, u16 index)
{
    struct sw_rx_page *sw_buf = &fp->rx_page_ring[index];
    struct page *page = sw_buf->page;
    ...
    // sw_buf->page may be NULL after previous thread freed it
}

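The pointer sw_buf itself is never NULL, since it is the address of a ring slot. The problem is sw_buf->page: it may already have been set to NULL if another thread cleaned up the entry first, so any code that uses it without checking ends up accessing invalid memory.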

How Could an Attacker Abuse This?

Normally, this bug happens "naturally" due to asynchronous error handling. Privilege escalation is unlikely: a local attacker would have to trigger device resets reliably and time the two threads precisely enough to act between the free and the stale access. System stability is the real risk: a user who can provoke a chain of resets (e.g. via crafted network packets or by intentionally causing device errors) can make the system reboot or become unavailable (Denial of Service).

Practical POC Exploit: (Crash Trigger)

A user with legitimate or physical access might trigger this by repeatedly resetting the PCIe device and forcing transmit timeouts:

# These commands are for demonstration purposes only
sudo ethtool -t eth1                    # Run the adapter self-test
sudo ip link set dev eth1 down
sudo ip link set dev eth1 up
# "sudo echo 1 > file" would perform the redirect as the unprivileged user, so use tee
echo 1 | sudo tee /sys/bus/pci/devices/0000:01:00.0/reset
# while firing up high-throughput traffic to trigger timeouts

In combination or via scripting, this can expose the race and crash the kernel.

The Official Patch – How Was it Fixed?

Solution: The kernel patch checks that each page about to be freed is still valid and owned by the page pool. This prevents double-free bugs and ensures safe device reset handling.
The core of the check, in simplified form:

// Pseudocode simplified
if (unlikely(!sw_buf->page))
    return;   // Already freed by another thread, skip

- Check pointer validity before freeing;
- Synchronize access to the page pool between threads/tasks.
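A plain NULL check like the pseudocode above only helps if the two paths do not interleave between the check and the free. The following is a minimal userspace sketch of one way to make the "check and take ownership" step a single operation, using C11 atomic_exchange with POSIX threads. It is an illustration of the general idea, not the upstream patch, and whether the kernel fix uses anything similar is not something this article shows. page_slot, pool_model, unload_worker() and run_concurrent_unload() are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_PAGES 64

/* Hypothetical model: each slot holds a pointer that can be swapped atomically. */
struct page_slot {
    _Atomic(void *) page;
};

struct pool_model {
    struct page_slot slots[POOL_PAGES];
    atomic_int frees; /* number of pages released across all threads */
};

/* Unload routine shared by both simulated reset paths. */
static void *unload_worker(void *arg)
{
    struct pool_model *pool = arg;

    for (int i = 0; i < POOL_PAGES; i++) {
        /* Swap in NULL and get the old value back in one atomic step,
         * so at most one caller ever sees the non-NULL pointer. */
        void *page = atomic_exchange(&pool->slots[i].page, (void *)NULL);

        if (!page)
            continue; /* another path already claimed this page */
        free(page);
        atomic_fetch_add(&pool->frees, 1);
    }
    return NULL;
}

/* Runs two unload paths concurrently. Returns the number of pages freed,
 * or -1 if the pool could not be allocated. */
int run_concurrent_unload(void)
{
    struct pool_model pool;
    pthread_t a, b;

    atomic_init(&pool.frees, 0);
    for (int i = 0; i < POOL_PAGES; i++)
        atomic_init(&pool.slots[i].page, (void *)NULL);

    for (int i = 0; i < POOL_PAGES; i++) {
        void *p = malloc(16);

        if (!p) {
            unload_worker(&pool); /* release what was allocated so far */
            return -1;
        }
        atomic_store(&pool.slots[i].page, p);
    }

    int started_a = pthread_create(&a, NULL, unload_worker, &pool) == 0;
    int started_b = pthread_create(&b, NULL, unload_worker, &pool) == 0;

    if (started_a)
        pthread_join(a, NULL);
    if (started_b)
        pthread_join(b, NULL);

    /* Final sweep: a no-op unless a thread failed to start. */
    unload_worker(&pool);
    return atomic_load(&pool.frees);
}
```

However the two workers interleave, each page is freed exactly once, so run_concurrent_unload() reports POOL_PAGES frees. On older toolchains the program may need to be linked with -pthread.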

References and Resources

- CVE-2024-26859 at NVD
- Upstream Linux commit fixing the bug
- LKML Discussion

Conclusion & Mitigation Steps

Who needs to patch:
If you use the bnx2x ethernet driver (common on Dell/HP servers), upgrade to a Linux kernel with this fix applied or patch your kernel immediately. Unpatched systems can crash or become unusable when devices encounter errors or are reset.
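To find out whether a machine is using the driver at all, one option is to look for bnx2x in /proc/modules. The sketch below assumes the usual /proc/modules layout, where each line starts with the module name followed by a space; module_listed() and bnx2x_loaded() are hypothetical helper names. Note two limits: a driver compiled into the kernel does not appear in /proc/modules, and a loaded module says nothing about whether the running kernel already contains the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Returns true if a /proc/modules-style listing has a line that starts
 * with exactly `name` followed by a space. */
bool module_listed(FILE *listing, const char *name)
{
    char chunk[512];
    size_t name_len;
    bool at_line_start = true;

    assert(listing != NULL && name != NULL);
    name_len = strlen(name);

    while (fgets(chunk, sizeof chunk, listing)) {
        size_t chunk_len = strlen(chunk);

        /* Only compare at the start of a line, not in the middle of a
         * long line that fgets() split into several chunks. */
        if (at_line_start &&
            strncmp(chunk, name, name_len) == 0 &&
            chunk[name_len] == ' ')
            return true;

        at_line_start = chunk_len > 0 && chunk[chunk_len - 1] == '\n';
    }
    return false;
}

/* Convenience wrapper for the live system; returns false if
 * /proc/modules cannot be opened. */
bool bnx2x_loaded(void)
{
    FILE *f = fopen("/proc/modules", "r");
    bool found;

    if (!f)
        return false;
    found = module_listed(f, "bnx2x");
    fclose(f);
    return found;
}
```

If bnx2x_loaded() returns true, compare the running kernel version against your distribution's advisory for CVE-2024-26859 to see whether the fix is included.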

Short-Term Workaround:
Monitor for kernel updates from your Linux distribution.

Long-Term:
Keep your systems up-to-date, especially if running critical infrastructure with Broadcom hardware.

Timeline

Published on: 04/17/2024 11:15:08 UTC
Last modified on: 03/03/2025 17:47:59 UTC