A new vulnerability, CVE-2025-21683, was recently identified and fixed in the Linux kernel's Berkeley Packet Filter (BPF) implementation. The bug was a memory leak in the bpf_sk_select_reuseport() function. The leak occurred when BPF-based reuseport logic handled TCP sockets, specifically when an established socket had SO_ATTACH_REUSEPORT_EBPF set on it before the connection was established. This post breaks down the bug, its impact, the relevant code, and the fix, in plain language aimed at developers and sysadmins.
What Is Reuseport and Why Does It Matter?
The reuseport feature lets multiple sockets listen on the same IP/port, and a BPF program can decide which one gets new incoming connections. Modern load-balancing and scalable server applications depend heavily on this mechanism.
If this logic is flawed—especially in the kernel where it handles sockets' lifecycles—a resource leak (like a memory leak) can crash systems or exhaust resources, risking denial-of-service.
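To make the reuseport mechanism concrete, here is a minimal userspace sketch in Python, assuming a Linux host where `socket.SO_REUSEPORT` is available. The helper name `open_reuseport_listeners` is made up for this illustration. No BPF program is attached, so the kernel falls back to its default distribution of incoming connections across the group.

```python
import socket

def open_reuseport_listeners(count, host="127.0.0.1"):
    """Open `count` TCP listeners that share one port via SO_REUSEPORT."""
    listeners = []
    port = 0  # 0 lets the kernel pick a free port for the first socket
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # The option must be set before bind() on every socket joining the group.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        s.listen()
        port = s.getsockname()[1]  # later sockets bind to the same port
        listeners.append(s)
    return listeners

listeners = open_reuseport_listeners(3)
shared_ports = {s.getsockname()[1] for s in listeners}
print(len(listeners), "listeners sharing port(s)", shared_ports)
for s in listeners:
    s.close()
```

All three sockets end up bound to the same port. A reuseport BPF program, when attached, replaces the kernel's default choice among them.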
Where the Leak Occurred
If a TCP socket became established _after_ SO_ATTACH_REUSEPORT_EBPF had been set on it, the kernel did not always handle its reference correctly. The helper treated a non-NULL sk_reuseport_cb as proof that the socket was not refcounted, so one error path returned without dropping the reference taken by the lookup. Each such call leaked a reference, and the memory tied to the socket could never be freed.
From the reproducers and crash logs:
unreferenced object 0xffff888101911800 (size 2048):
  comm "test_progs", pid 44109, jiffies 4297131437
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    80 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 9336483b):
    __kmalloc_noprof+0x3bf/0x560
    __reuseport_alloc+0x1d/0x40
    reuseport_alloc+0xca/0x150
    reuseport_attach_prog+0x87/0x140
    sk_reuseport_attach_bpf+0xc8/0x100
    sk_setsockopt+0x1181/0x1990
    do_sock_setsockopt+0x12b/0x160
    __sys_setsockopt+0x7b/0xc0
    __x64_sys_setsockopt+0x1b/0x30
    do_syscall_64+0x93/0x180
    entry_SYSCALL_64_after_hwframe+0x76/0x7e
This backtrace shows where the leaked object was allocated: the setsockopt() path that attaches a reuseport BPF program to a socket. The leak surfaced while running the test_progs test program.
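Reports in this format can be summarized with a few lines of Python. The sketch below is an assumption-laden helper, not an official tool: `parse_kmemleak` and the embedded `SAMPLE_REPORT` (a shortened copy of the report above) are illustrative, and the exact layout of such reports may differ between kernel versions.

```python
import re

# Shortened copy of the report shown above, used as test input.
SAMPLE_REPORT = """\
unreferenced object 0xffff888101911800 (size 2048):
  comm "test_progs", pid 44109, jiffies 4297131437
  backtrace (crc 9336483b):
    __kmalloc_noprof+0x3bf/0x560
    __reuseport_alloc+0x1d/0x40
    reuseport_alloc+0xca/0x150
    reuseport_attach_prog+0x87/0x140
"""

HEADER = re.compile(r"unreferenced object (0x[0-9a-f]+) \(size (\d+)\):")
COMM = re.compile(r'comm "([^"]+)", pid (\d+)')
FRAME = re.compile(r"\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

def parse_kmemleak(text):
    """Return one dict per 'unreferenced object' record found in `text`."""
    records = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            records.append({"address": header.group(1),
                            "size": int(header.group(2)),
                            "comm": None,
                            "frames": []})
            continue
        if not records:
            continue  # skip anything before the first record header
        comm = COMM.search(line)
        frame = FRAME.match(line)
        if comm:
            records[-1]["comm"] = comm.group(1)
        elif frame:
            records[-1]["frames"].append(frame.group(1))
    return records

records = parse_kmemleak(SAMPLE_REPORT)
for rec in records:
    print(rec["size"], rec["comm"], "allocated via", " <- ".join(rec["frames"]))
```

Grouping records by their first few frames is a quick way to see whether repeated leaks share one allocation site, as they do here with `__reuseport_alloc`.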
Vulnerable Logic (Before Fix)
The problematic code in the kernel's BPF reuseport selection looked like this (simplified for clarity):
    struct sock *bpf_sk_select_reuseport(...)
    {
        struct sock *sk = ...; // Looked up from the sockmap; the lookup may take a reference
        if (!sk)
            return ERR_PTR(-ENOENT);
        // Incorrect: assumes sk_reuseport_cb != NULL means the socket is NOT refcounted
        if (sk->sk_reuseport_cb) {
            // ... further checks; on failure, returns an error WITHOUT sock_put(sk)
        }
        return sk;
    }
On certain error paths, references weren’t properly dropped for some established sockets. Over time, this leaks memory.
The Patch: How It Was Fixed
The key was to explicitly drop the socket’s reference on every failing path, not just when the callback is missing.
Here’s a snippet illustrating the patch (simplified):
    // In bpf_sk_select_reuseport():
    if (error) {
        // Drop the lookup's reference regardless of sk_reuseport_cb state
        if (sk_is_refcounted(sk))
            sock_put(sk);
        return ERR_PTR(error_code);
    }
By dropping the reference on both error paths, regardless of the sk_reuseport_cb state, the kernel ensures that every reference taken by the lookup is released and the socket's memory can eventually be freed.
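The effect of the missing reference drop can be shown with a toy userspace model. This is not kernel code: `ToySock`, `toy_lookup`, `toy_put`, and `toy_select` are hypothetical names, and the two error branches only loosely mirror the kernel's checks. The sketch assumes that a lookup takes a reference on a refcounted socket and that the buggy version forgets to release it on the second error path.

```python
class ToySock:
    """Stand-in for a kernel socket; it only tracks a reference count."""
    def __init__(self, refcounted, has_cb, group_id):
        self.refcounted = refcounted  # True for e.g. an established TCP socket
        self.has_cb = has_cb          # models a non-NULL sk_reuseport_cb
        self.group_id = group_id      # models the reuseport group it belongs to
        self.refs = 1                 # reference held by the map itself

def toy_lookup(sk):
    """Model a map lookup: refcounted sockets gain one reference."""
    if sk.refcounted:
        sk.refs += 1
    return sk

def toy_put(sk):
    """Model dropping the lookup's reference."""
    if sk.refcounted:
        sk.refs -= 1

def toy_select(sk, caller_group, fixed):
    toy_lookup(sk)
    if not sk.has_cb:
        toy_put(sk)       # this path dropped the reference even before the fix
        return "error"
    if sk.group_id != caller_group:
        if fixed:
            toy_put(sk)   # the fix: drop the reference on this path too
        return "error"    # buggy version returns with the reference still held
    return "selected"

# An established socket that had a reuseport program attached earlier,
# looked up by a program running for a different group.
leaky = ToySock(refcounted=True, has_cb=True, group_id=2)
clean = ToySock(refcounted=True, has_cb=True, group_id=2)
for _ in range(1000):
    toy_select(leaky, caller_group=1, fixed=False)
    toy_select(clean, caller_group=1, fixed=True)
print(leaky.refs, clean.refs)  # prints "1001 1"
```

In the buggy variant the count grows by one per failed selection, so the object can never reach zero references and be freed. In the fixed variant the count stays at its baseline.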
Reference:
- Upstream Linux commit fixing CVE-2025-21683
Reproducing and Exploiting
While this wasn’t a remote code execution issue, it could be exploited to exhaust kernel memory:
1. Create many TCP sockets.
2. Set SO_ATTACH_REUSEPORT_EBPF on them.
3. Trigger establishment/reset cycles without proper closure.
4. Repeat. Over time, power users or malicious local processes could exhaust the server’s RAM, triggering the OOM killer or system crashes.
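One rough way to notice this kind of slow kernel memory growth is to compare the `SUnreclaim` field of `/proc/meminfo` over time. The sketch below is only an illustration under that assumption: the helper names `parse_meminfo` and `unreclaimable_growth_kb` are made up here, the sample values are invented, and growth in this counter can have many causes other than this bug.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values

def unreclaimable_growth_kb(before_text, after_text):
    """Return how much unreclaimable slab memory grew between two snapshots."""
    before = parse_meminfo(before_text).get("SUnreclaim", 0)
    after = parse_meminfo(after_text).get("SUnreclaim", 0)
    return after - before

# Invented sample snapshots; on a real host, read /proc/meminfo twice instead.
SNAPSHOT_BEFORE = "MemTotal:       16000000 kB\nSUnreclaim:       200000 kB\n"
SNAPSHOT_AFTER = "MemTotal:       16000000 kB\nSUnreclaim:       202048 kB\n"

growth = unreclaimable_growth_kb(SNAPSHOT_BEFORE, SNAPSHOT_AFTER)
print("SUnreclaim grew by", growth, "kB")
```

On a live system the two snapshots would come from `open("/proc/meminfo").read()` taken some minutes apart. A counter that rises steadily while the workload is flat is a hint to look closer, for example with kmemleak.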
Simple Exploit Script (PoC)
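    import socket
    import struct

    SO_ATTACH_REUSEPORT_EBPF = 52  # Value from asm-generic/socket.h (x86, arm64)
    prog_fd = 0  # Placeholder: must be the fd of a loaded reuseport eBPF program

    socks = []
    for _ in range(100000):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, SO_ATTACH_REUSEPORT_EBPF, struct.pack("i", prog_fd))
        socks.append(s)  # Do NOT close the socket, repeat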
*Note: The option and usage above are schematic; real proof-of-concept code would require loading a specific eBPF program and root permissions.*
Who Is Affected
- All Linux versions prior to the upstream fix (check your vendor’s security patches).
- Any system using BPF-based reuseport features, such as high-scale web servers, proxies like NGINX or Envoy, or load balancers that fan connections out across a group of sockets.
References
- Linux kernel patch for CVE-2025-21683
- Linux reuseport and eBPF documentation
- CVE record (MITRE, once public)
- Sockmap BPF API Reference
Summary
CVE-2025-21683 was a tricky memory leak in Linux’s BPF-based socket reuseport. Fixing it protects server resources against local memory exhaustion attacks. The key lesson: always manage object references on every path, even when internal logic seems sound. If you handle BPF or reuseport sockets in production, patch immediately.
*Stay tuned to your distribution’s security mailing lists and always review upstream kernel bug lists for hidden gotchas like this one!*
Timeline
Published on: 01/31/2025 12:15:29 UTC
Last modified on: 02/03/2025 20:01:29 UTC