CVE-2024-26921 - Preventing Use-After-Free in Linux Kernel Inet Defrag Code

Linux networking is complex and powerful, allowing high performance, flexible filtering, and encapsulation. However, complexity also increases the risk of subtle bugs, especially in areas like packet fragment reassembly.

CVE-2024-26921 was discovered in the Linux kernel's inet_defrag code. This bug could cause a use-after-free scenario due to premature release of socket references (sk), leading to possible crashes or unpredictable behavior in user networking applications, container orchestrators, or SDN frameworks like Open vSwitch.


What Is CVE-2024-26921?

Summary:
When the Linux kernel reassembles fragmented packets (for instance, in Netfilter or Open vSwitch paths used for firewalling and switching), it might release the associated socket (sk) too early, while other kernel code still depends on it being alive. If the socket is freed and its memory reused elsewhere, the kernel can end up dereferencing a pointer to freed memory: a classic use-after-free bug.

This is especially risky when packets are processed in the output path (i.e., when sending), where skb->sk (the socket reference in the socket buffer) is expected to remain valid across code paths that may involve fragment reassembly.

- A packet is sent by an application.
- In the kernel, it traverses the output path and some Netfilter/conntrack hooks.

Bug Mechanics:

- During reassembly (e.g., in ip_defrag()), the kernel previously called skb_orphan(), which removes the socket reference from the packet (and releases it if nothing else holds it).
- If the reassembly is completed _while still in a function that expects skb->sk to exist_, further code might access a pointer to already-freed memory.
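
The mechanics above can be illustrated with a toy model. This is only a sketch, not kernel code: `Sock`, `Skb`, and `buggy_output_path` are made-up Python stand-ins, and the model assumes the skb holds the last remaining reference to the socket (for example, because the application has already closed it).

```python
class Sock:
    """Toy stand-in for struct sock: one reference count, one 'freed' flag."""
    def __init__(self):
        self.refs = 0
        self.freed = False

    def hold(self):
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True  # the real kernel would release the memory here


class Skb:
    """Toy stand-in for struct sk_buff that owns one socket reference."""
    def __init__(self, sk):
        sk.hold()
        self.sk = sk

    def orphan(self):
        # Models skb_orphan(): drop the skb's reference and clear skb->sk.
        if self.sk is not None:
            self.sk.put()
            self.sk = None


def buggy_output_path(skb):
    sk = skb.sk      # the caller keeps using this pointer after the hook runs
    skb.orphan()     # early orphan during reassembly
    return sk.freed  # True means the caller now holds a dangling pointer


sock = Sock()
skb = Skb(sock)
print(buggy_output_path(skb))  # True
```

In the model, the caller's saved `sk` outlives the only reference that kept the socket alive, which is the shape of the bug described above.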

Real-World Effects:

- Panic/crash due to bad memory access.
- Subtler failures such as misaccounted socket memory and, in theory, privilege escalation (no such exploit was publicly known as of June 2024).

Prior to the fix, the ip_defrag() code would orphan the skb too soon:

// net/ipv4/ip_fragment.c (simplified!)
struct sk_buff *ip_defrag(...)
{
    ...

    // The skb was orphaned on entry, before knowing whether that was safe
    skb_orphan(skb); // <--- this could release sk too soon!

    ...
}

That could break code like this:

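int my_network_fn(struct sk_buff *skb)
{
    // We are passed skb and expect skb->sk to be a valid socket pointer
    struct sock *sk = skb->sk;  // saved before the call below
    process_skb(skb);           // placeholder name; may reach ip_defrag() through a hook
    // ... later code uses sk, which may already have been released ...
    return 0;
}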

If skb_orphan() is called inside ip_defrag() while a caller still relies on the socket, the socket can be freed and its memory reused elsewhere. Any later access through that socket pointer is then unsafe.

The Fix: Delay Orphaning Until Safe

Kernel developers (notably Eric Dumazet) recognized the core problem: you can't release the socket reference until you _know for sure_ it's safe to do so.

1. Delay skb_orphan: The skb is not orphaned until it is absolutely necessary.
2. Track Ownership Properly: When the last fragment completes the queue, its socket reference is moved to the reassembled head skb instead of being dropped, and memory bookkeeping stays correct even after reassembly or later fragmentation.
3. Align Reference Counting: Ensure that memory and socket reference counting are always accurate, especially if the packet is later fragmented again.
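
To see why the accounting in point 3 matters, here is a small arithmetic sketch. It is a simplified model, not the kernel's implementation: `Sock`, `reattach`, and the byte counts are invented for illustration, and `wmem_alloc` stands in for the socket's send-memory counter.

```python
class Sock:
    """Toy socket that only tracks bytes charged to its send buffer."""
    def __init__(self):
        self.wmem_alloc = 0


def reattach(sk, head_truesize, last_frag_truesize, fix_up):
    """Move the last fragment's charge to the reassembled head skb.

    Returns the counter value after the head skb is later freed.
    """
    # Charge that was made when the last fragment was sent.
    sk.wmem_alloc = last_frag_truesize
    if fix_up:
        # Also charge the extra bytes the reassembled head now represents.
        sk.wmem_alloc += head_truesize - last_frag_truesize
    # Freeing the head uncharges its full size.
    sk.wmem_alloc -= head_truesize
    return sk.wmem_alloc


print(reattach(Sock(), 1800, 600, fix_up=False))  # -1200
print(reattach(Sock(), 1800, 600, fix_up=True))   # 0
```

Without the fix-up, the counter goes negative once the larger reassembled packet is released; with it, charges and uncharges balance.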

Important note: The offset value that was formerly stored in the skb->sk member (aliased as ip_defrag_offset) is now moved into a proper fragment control block (FRAG_CB) structure, avoiding clobbering the socket pointer.
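
The aliasing problem in this note can be reproduced in user space with a union. The sketch below uses Python's ctypes; the type name `SkAlias`, the pointer value, and the offset are made up, and a 64-bit integer stands in for the socket pointer.

```python
import ctypes


class SkAlias(ctypes.Union):
    # Two views of the same storage, mirroring the aliasing described above.
    _fields_ = [
        ("sk", ctypes.c_uint64),               # stand-in for a 64-bit pointer
        ("ip_defrag_offset", ctypes.c_int32),  # shares the same bytes
    ]


u = SkAlias()
u.sk = 0xFFFF888012345678    # made-up pointer value
before = u.sk
u.ip_defrag_offset = 1480    # made-up fragment offset
clobbered = u.sk != before
print(clobbered)  # True
```

Writing the offset overwrites part of the stored pointer, so the two fields cannot both be live at once. Moving the offset elsewhere is what lets skb->sk survive long enough for the orphan to be delayed.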

Corrected Code (Simplified and Commented)

// net/ipv4/ip_fragment.c (simplified for clarity)
struct sk_buff *ip_defrag(...)
{
    ...

    // Orphan only when the skb is going to be queued on the fragment list (safe point)
    if (should_orphan_now)
        skb_orphan(skb);

    // If this skb completes the reassembly queue, move its socket reference to the head
    if (is_last_fragment) {
        // reattach_sk() and fixup_wmem_accounting() are illustrative names, not kernel functions
        reattach_sk(head_skb, skb->sk);
        fixup_wmem_accounting();
    }

    ...
}
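
The same decision can be modeled in a few lines of Python. This is a hedged sketch of the idea only: `Sock`, `Skb`, `FragQueue`, and `defrag` are invented names, the completeness check ignores overlapping or duplicate fragments, and reference counting is reduced to a bare integer.

```python
class Sock:
    """Toy socket with a bare reference counter."""
    def __init__(self):
        self.refs = 0


class Skb:
    """Toy fragment: byte offset, payload length, MF flag, optional owner."""
    def __init__(self, offset, length, more_fragments, sk=None):
        self.offset = offset
        self.length = length
        self.more_fragments = more_fragments
        self.sk = sk
        if sk is not None:
            sk.refs += 1

    def orphan(self):
        if self.sk is not None:
            self.sk.refs -= 1
            self.sk = None


class FragQueue:
    def __init__(self):
        self.frags = []
        self.total_len = None  # known once the fragment with MF=0 arrives


def defrag(queue, skb):
    """Return the reassembled head skb, or None if skb was queued."""
    if not skb.more_fragments:
        queue.total_len = skb.offset + skb.length
    received = sum(f.length for f in queue.frags) + skb.length
    if queue.total_len is None or received < queue.total_len:
        skb.orphan()             # safe: a queued skb goes no further
        queue.frags.append(skb)
        return None
    head = Skb(0, queue.total_len, False)
    head.sk, skb.sk = skb.sk, None  # steal the reference, count unchanged
    return head


sock = Sock()
queue = FragQueue()
first = defrag(queue, Skb(0, 1480, True, sock))
head = defrag(queue, Skb(1480, 520, False, sock))
print(first, head.sk is sock, sock.refs)  # None True 1
```

The queued fragment gives up its reference because it never leaves the reassembly engine, while the completing fragment hands its reference to the head, so the socket stays alive for the rest of the output path.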

Exploit Details

This bug is not a classic privilege escalation or easy RCE, but it does expose a use-after-free that a local attacker could in theory abuse:

Trigger:

- Send a sequence of crafted fragmented packets over a socket, while manipulating the output path (e.g., using Netfilter, OVS, or similar).

- Arrange things so the packet will be reassembled while it is still needed downstream.

- Exploit comes from controlling the reuse of the freed sk structure before the kernel code path has finished using it.

- No public, reliable exploit was known as of June 2024.

- Possible outcomes: Local DoS (kernel panic), possibly privilege escalation if attacker controls or predicts the freed memory.

Risk:

- More relevant in containerized or virtual environments with lots of complex network plumbing (e.g., Kubernetes with CNI, Open vSwitch, etc).
- Reduced risk if only basic IPv4 is used without advanced hooks, but hard to guarantee in modern setups.

PoC (Conceptual, Not Real Full Exploit)

import socket

# Illustrative sender only; on its own it does not trigger the bug.
def send_fragments(dest=('192.168.1.1', 1234)):
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 65536)
        # Placeholder payloads. Each sendto() below emits a complete UDP
        # datagram, not an IP fragment; real fragments would have to be
        # built by hand (for example, with a raw socket).
        data1 = b'A' * 140
        data2 = b'B' * 140
        s.sendto(data1, dest)  # first datagram
        s.sendto(data2, dest)  # second datagram

# Would need netfilter/OVS/conntrack hooks on tx path to trigger bug

In practice, reliably hitting the use-after-free window is difficult outside of lab setups.

References

- CVE-2024-26921 at NVD (National Vulnerability Database)
- Linux kernel patch/commit fixing this bug
- Linux kernel mailing list: Eric Dumazet's analysis
- Earlier Linux patch 8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")

Who Is Affected

- All Linux kernels from the release that introduced this logic up to the point where the fix is merged or backported.
- Systems running Open vSwitch (OVS)
- Systems using conntrack acceleration on the transmit/output path

How to Protect Yourself

- Update your Linux kernel! If you're running anything remotely recent (5.x or 6.x), check with your distro for security updates.
- If you must run an older kernel, consider disabling or limiting user-created Netfilter/OVS rules until patches are applied (though that effectively weakens your security model in other ways).
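
As a starting point for the first step, a short script can compare the running kernel's version with the fixed version named in your distribution's advisory. This is a rough sketch under stated assumptions: `parse_kernel_release`, `is_at_least`, and `EXAMPLE_FIXED` are hypothetical names, the example version is a made-up placeholder, and distributions often backport fixes without changing the upstream version number, so the advisory remains the authority.

```python
import platform
import re


def parse_kernel_release(release):
    """Extract (major, minor, patch) from a string such as '6.8.0-31-generic'."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor or 0), int(patch or 0)


def is_at_least(release, fixed):
    """True if the release is the same as or newer than the fixed version tuple."""
    return parse_kernel_release(release) >= fixed


# Made-up value for illustration only; use the version from your advisory.
EXAMPLE_FIXED = (9, 9, 9)

if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r` on Linux
    print(running, is_at_least(running, EXAMPLE_FIXED))
```

A result of False means the version number alone does not show the fix, not that the system is necessarily vulnerable; check the package changelog or advisory for a backport.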

Conclusion

CVE-2024-26921 serves as a reminder of how critical kernel reference handling can be. This time, a subtle shift in when the kernel's networking stack "lets go" of a socket reference made the difference between safe code and a potentially exploitable vulnerability. If you operate any system that hosts containers, network namespaces, or does software switching/bridging, patch now!

Have more questions? See the above references or reach out to your distribution's security channels.

Timeline

Published on: 04/18/2024 10:15:07 UTC
Last modified on: 05/04/2025 08:59:45 UTC