The Linux kernel is the backbone of many modern operating systems, powering everything from servers to smartphones. Recently, a critical security vulnerability—CVE-2025-21864—was discovered and patched. This long-read post will guide you through what happened, why it mattered, and how it was fixed, using clear language and real-world code.
What is CVE-2025-21864?
CVE-2025-21864 is a vulnerability in the Linux kernel's TCP stack (affecting kernels that support IPComp over IPv6 and network namespaces). The bug involved the cleanup of two networking objects attached to SKBs (socket buffers):
- secpath (security path, used by the IPsec/XFRM framework)
- dst (destination cache entry used for routing)
Under certain conditions—specifically when deleting network namespaces (netns) during active TCP/IPComp6 use—stale references to security objects could linger in kernel memory, causing resource leaks. This improper resource cleanup was observable as a kernel warning (not immediate corruption or remote code execution, but something that could cause system instability, especially in networking-heavy setups or with custom kernel builds).
The trigger is deleting the namespaces while packets from the IPComp6-protected TCP session are still waiting to be freed. If, at that point, a reference to a security path structure (secpath) remained because the SKBs holding it were defer-freed only after the netns teardown, the kernel kept that reference alive when it should already have let go. This prevented the net namespace from being cleanly torn down.
In simplified terms, imagine moving out of a house but forgetting to turn in your keys—someone might think you’re still living there!
- Resource Leaks: Consumed memory, with potential for DoS (denial of service) if triggered repeatedly.
- Security: Lingering state objects (e.g. SAs/transform states) outlive their network namespace context, increasing the kernel's attack surface.
- Kernel Warnings: Tripped sanity checks; could result in a panic or system instability on stricter builds.
Exploit Scenario
While not a classic remote exploit, here’s how a local user with the privileges to set up netns and XFRM (typically CAP_NET_ADMIN) could trigger the bug:
Step-by-Step Reproduction

    # Create two namespaces
    ip netns add ns1
    ip netns add ns2

    # Create a veth pair and move one end into each namespace
    ip link add veth1 type veth peer name veth2
    ip link set veth1 netns ns1
    ip link set veth2 netns ns2

    # In ns1: assign IP & bring up
    ip netns exec ns1 ip addr add 2001:db8:1::1/64 dev veth1
    ip netns exec ns1 ip link set veth1 up

    # In ns2: assign IP & bring up
    ip netns exec ns2 ip addr add 2001:db8:1::2/64 dev veth2
    ip netns exec ns2 ip link set veth2 up

    # Set up IPComp6 between ns1 and ns2 (requires strongSwan or manual xfrm setup)
    # [Simplified for brevity]
    # ... Assume IPComp/XFRM SAs on both ends covering this subnet's traffic

    # Run netcat or a simple TCP transfer
    # (a listener on port 1234 must already be running in ns2)
    ip netns exec ns1 nc 2001:db8:1::2 1234

    # Tear down the namespaces (without explicit xfrm state cleanup)
    ip netns del ns1
    ip netns del ns2

    # dmesg or the kernel log will then show a WARNING with
    # xfrm6_tunnel_net_exit in the call trace

On an unpatched kernel, the xfrm_state lingers because a secpath attached to an SKB that is still awaiting its deferred free holds a reference to it.
The Root Cause: secpath Reference Not Dropped With dst
Under normal operations, when SKBs are processed in the TCP receive path, the destination cache (dst) is released. But the security path (secpath) wasn’t being dropped at the same time. If packets in flight were defer-freed after netns destruction, they’d still hold a reference, causing trouble.
Code Excerpt (Vulnerable Area)

    /* Simplified illustration of the TCP receive path (for example net/ipv4/tcp_input.c) */
    skb_dst_drop(skb);      /* dst reference is released before the skb is queued */
    /* Problem: the secpath attached to the skb is not released here */

The Fix: Drop secpath at the Same Time as dst

    /* Simplified illustration: release both references at the same point */
    skb_dst_drop(skb);
    secpath_reset(skb);     /* new: also drop the secpath (and its xfrm_state reference) */

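To make the lifetime problem concrete, here is a small user-space model of the reference counting involved. It is only a sketch: `model_state`, `model_skb`, and every function below are hypothetical stand-ins for the kernel's xfrm_state, SKB, and receive/teardown paths, not kernel code. It returns how many references to the state are still held by a pending SKB at the moment the namespace is torn down.

```c
/* User-space model only: names mirror kernel concepts, none of this is kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_state { int refcnt; };   /* stands in for an xfrm_state */

struct model_skb {
    struct model_state *sp;           /* stands in for the secpath's state reference */
    bool has_dst;                     /* stands in for the dst reference */
};

static void state_put(struct model_state *x)
{
    assert(x->refcnt > 0);
    x->refcnt--;
}

/* TCP receive path: drop what is no longer needed before queuing the skb. */
static void model_tcp_queue(struct model_skb *skb, bool patched)
{
    skb->has_dst = false;             /* the dst was already dropped here */
    if (patched && skb->sp) {         /* the fix: drop the secpath as well */
        state_put(skb->sp);
        skb->sp = NULL;
    }
}

/* Deferred free: runs later, possibly after the netns is gone. */
static void model_defer_free(struct model_skb *skb)
{
    if (skb->sp) {
        state_put(skb->sp);
        skb->sp = NULL;
    }
}

/* Returns the references to the state still held by skbs at netns teardown. */
int model_refs_at_netns_exit(bool patched)
{
    struct model_state st = { .refcnt = 1 };   /* reference owned by the state table */
    struct model_skb skb = { .sp = &st, .has_dst = true };
    int lingering;

    st.refcnt++;                       /* the secpath takes its own reference */
    model_tcp_queue(&skb, patched);    /* packet is received and queued */
    /* the application reads the data; the skb sits on a defer list, not yet freed */

    st.refcnt--;                       /* netns exit: the state table drops its reference */
    lingering = st.refcnt;             /* anything left is held by the pending skb */

    model_defer_free(&skb);            /* the defer list is flushed too late */
    assert(st.refcnt == 0);            /* nothing is leaked forever, only held too long */
    return lingering;
}
```

In this model, `model_refs_at_netns_exit(false)` reports one lingering reference and `model_refs_at_netns_exit(true)` reports none, which mirrors why the teardown-time sanity check fires only without the fix.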
CVE Entry: CVE-2025-21864 Details (MITRE)
Who should care?
- SysAdmins: If you use LXC, Docker, or other tools that create/destroy many netns: patch ASAP.
- Security researchers: A good case study in reference counting bugs in the networking stack.
- Kernel hackers: Learn from the race/defer-free pattern.
How to Protect Yourself
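- Upgrade your kernel: Make sure your distro ships the patch. The upstream fix, titled "tcp: drop secpath at the same time as we currently drop dst", was merged in early 2025 and backported to stable series; check your distribution's security tracker for CVE-2025-21864 for the exact fixed versions.
- Disable unused netns features where possible.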
### Mitigation/Detection
- Monitor kernel logs for warnings about xfrm6_tunnel_net_exit during netns teardown.
- Use syzkaller or similar fuzzers if building networking apps based on namespaces and XFRM/IPComp.
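As a rough sketch of the log monitoring idea, the helper below scans kernel log text for the function name associated with the warning. The function names `line_mentions_xfrm6_exit` and `count_xfrm6_exit_lines` are made up for this post, and the exact format of the warning line varies between kernel versions, so the check deliberately matches only the symbol name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if a kernel log line mentions the function named in the warning. */
bool line_mentions_xfrm6_exit(const char *line)
{
    assert(line != NULL);
    return strstr(line, "xfrm6_tunnel_net_exit") != NULL;
}

/*
 * Counts matching lines on a stream, e.g. stdin fed by `dmesg`.
 * Lines longer than the buffer are read in pieces, so a match that
 * straddles a piece boundary could be missed; acceptable for a sketch.
 */
int count_xfrm6_exit_lines(FILE *in)
{
    char buf[1024];
    int hits = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        if (line_mentions_xfrm6_exit(buf))
            hits++;
    }
    return hits;
}
```

Wrapped in a `main` that calls `count_xfrm6_exit_lines(stdin)`, this could be fed from `dmesg` in a periodic job. A plain `grep` does the same thing, but a small program is easier to extend with alerting.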
Takeaways & Lessons
- Resource leaks in kernel networking code can create broader security risks, even if there’s no direct code execution path.
- Reference counting is *tricky* in high-performance networking paths. Small changes ripple out.
- Always audit extended object lifetimes, especially when using deferred freeing or off-CPU work (like kernel softirqs).
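One way to make lifetime audits easier is to record who holds each reference, so that a leftover reference at teardown can be attributed to its holder. The sketch below is a toy user-space illustration of that idea; `tracked_ref` and its helpers are hypothetical names invented here, not a kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HOLDERS 8

struct tracked_ref {
    const char *holders[MAX_HOLDERS];  /* tag of each outstanding reference */
    size_t count;
};

/* Take a reference, recording who took it. Returns 0 on success, -1 if full. */
int tracked_get(struct tracked_ref *r, const char *who)
{
    assert(who != NULL);
    if (r->count == MAX_HOLDERS)
        return -1;
    r->holders[r->count++] = who;
    return 0;
}

/* Release the reference taken by `who`. Returns 0 on success, -1 if not found. */
int tracked_put(struct tracked_ref *r, const char *who)
{
    for (size_t i = 0; i < r->count; i++) {
        if (strcmp(r->holders[i], who) == 0) {
            r->holders[i] = r->holders[--r->count];  /* swap-remove */
            return 0;
        }
    }
    return -1;
}

/* At teardown: tag of one remaining holder, or NULL if none remain. */
const char *tracked_leak(const struct tracked_ref *r)
{
    return r->count ? r->holders[0] : NULL;
}
```

Applied to this bug, taking references as `"state_table"` and `"skb_secpath"` and releasing only the first would leave `tracked_leak()` pointing at `"skb_secpath"`, which is the kind of hint that shortens the hunt for a lingering holder.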
Conclusion
CVE-2025-21864 is a great example of a subtle kernel bug with real-world implications, but it also showcases the power of open-source investigation and quick patching. If you care about your system’s networking stability (and security), update as soon as your vendor pushes patches.
Want to explore further?
- Read the Patch on kernel.org
- Read the Original Discussion
- Submit your own kernel bug
Timeline
Published on: 03/12/2025 10:15:19 UTC
Last modified on: 03/24/2025 15:41:35 UTC