CVE-2021-46936 - Use-After-Free in Linux Kernel's `tw_timer_handler` Explained
Summary:
CVE-2021-46936 is a critical use-after-free vulnerability in the Linux kernel, affecting how IPv4 network statistics are handled during the destruction of a network namespace. If left unfixed, attackers or faulty processes might trigger kernel panics or potentially escalate privileges. In this post, we’ll walk through what happened, how the bug works, show code snippets, and explain the patch that resolved it.
What Happened? (The Panic)
Back in Linux kernel 5.4, some users reported sudden kernel panics during network namespace cleanup. The error stack looked like this:
BUG: unable to handle page fault for address: ffffde49a863de28
RIP: 0010:tw_timer_handler+0x20/0x40
Call Trace:
 <IRQ>
 call_timer_fn+0x2b/0x120
 run_timer_softirq+0x1ef/0x450
 __do_softirq+0x10d/0x2b8
 irq_exit+0xc7/0xd0
 smp_apic_timer_interrupt+0x68/0x120
 apic_timer_interrupt+0xf/0x20
The kernel crashed because it accessed memory that had already been freed: a classic use-after-free.
Root Cause: Namespace Cleanup Order
The problem lies in the order that network subsystems were being torn down when a network "namespace" (an isolated network stack, e.g. used for containers) was destroyed. Two pieces are involved:
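ipv4_mib_ops: Manages the per-namespace IPv4 statistics (like packet counts), including allocating and freeing net->mib.net_statistics.
tcp_sk_ops: Manages per-namespace TCP state; its exit handler, tcp_sk_exit_batch(), purges sockets still in the time-wait state.
When a namespace is destroyed, the exit handlers on pernet_list run in reverse registration order. Because tcp_sk_ops was registered before ipv4_mib_ops, ipv4_mib_exit_net() freed the IPv4 statistics while TCP time-wait timers could still be pending. When one of those lingering timers fired and tried to update the freed statistics, the kernel panicked.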
Simple Illustration
// Pseudo-order during namespace teardown:
ipv4_mib_exit_net(); // Frees net->mib.net_statistics
tcp_sk_exit_batch(); // tw_timer_handler runs, accesses freed memory!
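Namespace cleanup walks the registered per-namespace operations in reverse registration order, so whatever registers first is torn down last. The sketch below models only that rule in userspace C. It is not kernel code: `toy_pernet_ops`, `toy_pernet_list`, `toy_register`, and `toy_exit_order` are hypothetical names invented for illustration, and a fixed-size array stands in for the kernel's linked list.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct pernet_operations: only a name is kept. */
struct toy_pernet_ops {
    const char *name;
};

#define TOY_MAX_OPS 8

/* Toy registry; entries are appended in registration order. */
struct toy_pernet_list {
    const struct toy_pernet_ops *ops[TOY_MAX_OPS];
    size_t count;
};

/* Append ops to the registry. Returns 0 on success, -1 if it is full. */
static int toy_register(struct toy_pernet_list *list,
                        const struct toy_pernet_ops *ops)
{
    if (list->count >= TOY_MAX_OPS)
        return -1;
    list->ops[list->count++] = ops;
    return 0;
}

/*
 * Walk the registry backwards and record the names in the order their
 * exit handlers would run. Returns the number of names written to out.
 */
static size_t toy_exit_order(const struct toy_pernet_list *list,
                             const char **out, size_t out_len)
{
    size_t n = 0;

    for (size_t i = list->count; i > 0 && n < out_len; i--)
        out[n++] = list->ops[i - 1]->name;
    return n;
}
```

Registering an entry named tcp_sk_ops and then one named ipv4_mib_ops in this model gives the exit order ipv4_mib_ops, tcp_sk_ops, which is the pseudo-order shown above.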
The Vulnerable Code Path
The core bug was triggered in tw_timer_handler(), which would use net->mib.net_statistics even after it had been freed by ipv4_mib_exit_net().
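Vulnerable snippet (net/ipv4/inet_timewait_sock.c, as of Linux 5.4)
static void tw_timer_handler(struct timer_list *t)
{
    struct inet_timewait_sock *tw = from_timer(tw, t, tw_timer);
    if (tw->tw_kill)
        __NET_INC_STATS(twsk_net(tw), LINUX_MIB_TIMEWAITKILLED);
    else
        __NET_INC_STATS(twsk_net(tw), LINUX_MIB_TIMEWAITED);
    inet_twsk_kill(tw);
}
__NET_INC_STATS() expands to a per-CPU increment on the possibly-freed statistics structure
#define __NET_INC_STATS(net, field) \
    __SNMP_INC_STATS((net)->mib.net_statistics, field)
#define __SNMP_INC_STATS(mib, field) \
    __this_cpu_inc(mib->mibs[field])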
If net_statistics is already freed, this is a use-after-free, risking arbitrary memory corruption or a panic.
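To make the lifetime problem concrete, here is a small userspace model of a statistics block and a time-wait timer that refers to it. Every `toy_` name is hypothetical and exists only for this sketch. Unlike the kernel, the model clears the pointer on free so that a late access can be detected and reported instead of actually touching freed memory.

```c
#include <stdlib.h>

/* Toy model of a namespace; the counter stands in for
 * net->mib.net_statistics. */
struct toy_net {
    long *timewaited;
};

/* Toy model of a time-wait timer that may still be pending. */
struct toy_tw_timer {
    struct toy_net *net;
    int pending;
};

/* Stands in for the statistics allocation at namespace creation. */
static int toy_mib_init(struct toy_net *net)
{
    net->timewaited = calloc(1, sizeof(*net->timewaited));
    return net->timewaited ? 0 : -1;
}

/* Stands in for ipv4_mib_exit_net(): free the counter.
 * Assumption of the model: the pointer is cleared so that a late
 * access is detectable. The real code leaves a dangling pointer. */
static void toy_mib_exit(struct toy_net *net)
{
    free(net->timewaited);
    net->timewaited = NULL;
}

/* Stands in for tcp_sk_exit_batch(): cancel timers still pending. */
static void toy_tcp_exit_batch(struct toy_tw_timer *timers, size_t n)
{
    for (size_t i = 0; i < n; i++)
        timers[i].pending = 0;
}

/*
 * Stands in for tw_timer_handler().
 * Returns  1 if the counter was incremented,
 *          0 if the timer was not pending,
 *         -1 if the statistics were already freed (the point where
 *            the kernel would touch freed memory).
 */
static int toy_tw_timer_fire(struct toy_tw_timer *tw)
{
    if (!tw->pending)
        return 0;
    tw->pending = 0;
    if (!tw->net->timewaited)
        return -1;
    (*tw->net->timewaited)++;
    return 1;
}
```

Calling toy_mib_exit() while a timer is still pending makes the next toy_tw_timer_fire() return -1. Calling toy_tcp_exit_batch() before toy_mib_exit() makes it return 0, because no timer is left to run.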
Public Reports & Timeline
As detailed in the syzkaller bug thread, this kernel bug was observed and reported as early as 2017. Multiple fixes were attempted, but the root order-of-operations error persisted until the proper registration fix.
Why Did This Happen? (Technical Details)
This bug was *not* introduced by making stats updates per-namespace, but rather by moving the stats allocation from a global to a dynamically allocated (per-namespace) structure. Specifically, commit 61a7e26028b9 moved stats onto each struct net.
Because tcp_sk_ops was registered before ipv4_mib_ops, the statistics were torn down first. Time-wait timers fire asynchronously in softirq context, so one could still run after the statistics had been freed. That window is the race behind the crash.
The Fix
The resolution was straightforward: move the init_ipv4_mibs() call ahead of tcp_init() in inet_init(). ipv4_mib_ops is then registered before tcp_sk_ops, and since teardown runs in the opposite order, tcp_sk_ops is torn down first. This ensures all TCP-related handlers finish before the statistics are freed.
Patch Excerpt
// In net/ipv4/af_inet.c (abridged):
static int __init inet_init(void)
{
    /* ... */
    ip_init();
    /* Initialise per-cpu ipv4 mibs */
    if (init_ipv4_mibs())
        panic("%s: Cannot init ipv4 mibs\n", __func__);
    /* Setup TCP slab cache for open requests. */
    tcp_init();
    /* ... rest omitted ... */
}
Additionally, instead of pr_crit(), the fix now uses panic() if initialization fails, ensuring a hard stop.
Commit Reference:
- net: fix use-after-free in tw_timer_handler (kernel.org)
Exploit scenario:
A malicious program repeatedly creates and destroys network namespaces while ensuring TCP connections stay open and enter TIME_WAIT. By carefully timing the teardown, it could force access to freed memory, leading to a kernel crash. In outline:
- Leave TCP connections in the TIME_WAIT state while the namespace is being torn down.
- Race the teardown so that pending timers for these TCP sockets run after the stats memory is gone.
- Watch for a kernel panic or use-after-free report in dmesg.
> Note: There's no public PoC for privilege escalation, but reliable kernel crash/"panic" is very possible.
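For anyone triaging crash logs, the faulting-instruction line from the panic shown earlier is the most distinctive marker. The helper below is a minimal sketch of how saved kernel log lines could be checked for it. The function name `line_matches_tw_timer_oops` is made up for this example, and a match is only a hint: the same handler could fault for other reasons.

```c
#include <string.h>

/*
 * Return 1 if a kernel log line contains "RIP:" followed later on the
 * same line by "tw_timer_handler+", as in the trace from this bug;
 * otherwise return 0.
 */
static int line_matches_tw_timer_oops(const char *line)
{
    const char *rip = strstr(line, "RIP:");

    if (!rip)
        return 0;
    return strstr(rip, "tw_timer_handler+") != NULL;
}
```

Feeding each line of a saved dmesg dump through this function flags the RIP line but not ordinary Call Trace entries.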
References & Further Reading
- Kernel patch on kernel.org
- Original 2017 bug report via syzkaller
- Commit 61a7e26028b9
Final Thoughts
CVE-2021-46936 is a great example of how resource cleanup and async event handling can lead to serious kernel bugs. Even innocuous timer handlers must be careful about lifetimes of objects and teardown order. For users and admins, the solution is simple: keep your kernels up to date!
If you have questions about kernel security, container hardening, or want to see code-level analysis for similar bugs, let us know in the comments below!
Timeline
Published on: 02/27/2024 10:15:08 UTC
Last modified on: 04/10/2024 19:20:08 UTC