CVE-2021-47069 - Deep Dive Into a Subtle Linux Kernel Race Condition Exploit

Keywords: Linux kernel, CVE-2021-47069, Message Queues, Race Condition, Stack Reference, Exploit, Vulnerability, Security

The Linux kernel is well known for its robustness, but even here, rare and subtle bugs can sneak in, especially in complex subsystems like Inter-Process Communication (IPC). CVE-2021-47069 is a fascinating example: an elusive race condition on a stack reference, exploitable (under certain conditions) through the POSIX message queue (mqueue), System V message (msg), and System V semaphore (sem) subsystems.

In this long read, we'll walk through what happened, how the exploit could work, some kernel code snippets, the exact kernel patch that closed it, and why even rare race bugs matter so much in kernel security.

Summary of CVE-2021-47069

- CVE ID: CVE-2021-47069

- Impact: Kernel crash (DoS). Potential risk for privilege escalation in custom kernel builds.

- Fixed in: Commit 3e65b8d7f8 (mainline)

High-Level Description

The vulnerability exists because the kernel's IPC code, specifically in message handling (mqueue, msg, sem), passes a pointer to a local stack variable (a structure representing a "wait queue element") between sender and receiver tasks.

Normally, this would be safe if the variable's lifetime always outlasted any reference to it, but there is a rare race: the sender can use the pointer *after* the receiver's function has returned, meaning it points to invalid memory (a "use after free" of a stack slot). If the memory is modified or reused, this results in a crash or, potentially, something worse.

This bug was extremely difficult to exploit directly - but in security, *rare* ≠ *safe*.

Exploit Scenario: Step-by-Step Crash Walkthrough

Let's break down how an attacker (with local code execution) might trigger this.

1. Set Up the Race:

User A (receiver) calls mq_timedreceive. This places a wait-queue object on A's stack (say at address 0x7fff...a100) and records that address in the kernel's wait queue.

2. Enter Kernel Paths:

While A is waiting, B (sender) calls mq_timedsend. The kernel looks at the receiver's wait queue and fetches the pointer A gave it (still stack-allocated).

3. Race Window:

The sender marks the receiver's wait object as "ready" but has *not* finished using it yet. In that gap, the receiver wakes up and returns from the syscall, so its stack frame is gone and 0x7fff...a100 is now unowned memory.

4. Use-after-free:

Now, the sender code *finally* uses that pointer, assuming it's still a valid object, and tries to access a field like .task (expecting struct task_struct *). The stack memory likely contains garbage by now—leading to a kernel crash.
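The four steps can be replayed deterministically in user space. The sketch below is only a model, not kernel code: `sim_waiter`, `sim_frame_reused`, `sim_buggy_send`, and `sim_walkthrough` are hypothetical names, an integer stands in for the `task_struct` pointer, and the "returned" stack frame is imitated by overwriting a slot that is still valid C memory, so the program itself has no undefined behavior.

```c
#include <assert.h>
#include <string.h>

enum { SIM_STATE_NONE = 0, SIM_STATE_READY = 1 };

/* Hypothetical stand-in for the receiver's on-stack wait object. */
struct sim_waiter {
    int state;
    int task_id; /* stands in for struct task_struct *task */
};

/* Imitates the receiver returning: later calls reuse and overwrite its frame. */
void sim_frame_reused(struct sim_waiter *slot)
{
    memset(slot, 0xAA, sizeof(*slot));
}

/* Steps 2-4 in the vulnerable order; returns the task id the sender would wake. */
int sim_buggy_send(struct sim_waiter *slot)
{
    assert(slot != NULL);
    slot->state = SIM_STATE_READY; /* step 3: receiver is free to leave */
    sim_frame_reused(slot);        /* receiver returns before the sender continues */
    return slot->task_id;          /* step 4: stale read from the reused slot */
}

/* Step 1 plus the rest of the walkthrough for a single receiver. */
int sim_walkthrough(int receiver_task_id)
{
    struct sim_waiter slot = { SIM_STATE_NONE, receiver_task_id };
    return sim_buggy_send(&slot);
}
```

Calling `sim_walkthrough(42)` does not return 42: the sender reads whatever the overwritten slot now holds, which is the same kind of stale value the kernel then treats as a task pointer.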

Observed Crash Trace:

  RIP: 0010:wake_q_add_safe+0x13/0x60
  Call Trace:
   __x64_sys_mq_timedsend+0x2a9/0x490
   do_syscall_64+0x80/0x680
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f5928e40343

This is the classic signature of using a pointer (here, a task_struct *) that does not belong to a valid kernel structure.

Let's look at a simplified version of the relevant kernel logic (before the fix):

// Receiver (do_mq_timedreceive)
struct ext_wait_queue ewq;  // lives on the receiver's kernel stack
ewq.task = current;
ewq.state = STATE_NONE;
wq_sleep(&ewq); // links the address of this stack variable into the queue's wait list

// Sender (do_mq_timedsend) - in pipelined_send
struct ext_wait_queue *waiter = get_first_waiter();
smp_store_release(&waiter->state, STATE_READY); // signal wake

// Window: if receiver returns here, ewq is now garbage, but...
wake_q_add_safe(wake_q, waiter->task); // <-- reads through the stack pointer after return!

The waiter pointer is valid only while the receiver is inside the kernel function. If the sender dereferences after that, anything can happen.
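Why the window opens the moment the state flips can be illustrated with C11 atomics in user space. This is a minimal sketch under assumptions: `handoff`, `handoff_publish`, and `handoff_try_consume` are hypothetical names, and `memory_order_release` / `memory_order_acquire` stand in for the kernel's `smp_store_release` and the matching acquire on the receiver side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum { HANDOFF_NONE = 0, HANDOFF_READY = 1 };

/* Hypothetical wait object shared between a publisher and a consumer. */
struct handoff {
    atomic_int state;
    int payload;
};

/* Publisher side: fill in the data, then flip the state with release semantics. */
void handoff_publish(struct handoff *h, int payload)
{
    assert(h != NULL);
    h->payload = payload;
    atomic_store_explicit(&h->state, HANDOFF_READY, memory_order_release);
    /* From this point the consumer may already be gone: do not touch *h. */
}

/* Consumer side: once READY is observed, the payload is visible and the
 * consumer is free to return and let the object's storage go away. */
bool handoff_try_consume(struct handoff *h, int *out)
{
    if (atomic_load_explicit(&h->state, memory_order_acquire) != HANDOFF_READY)
        return false;
    *out = h->payload;
    return true;
}
```

The rule that falls out of this pairing: anything the publisher still needs from the object has to be read before the release store, because nothing stops the consumer from leaving right after it.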

The Kernel Fix: Don't Trust the Stack Pointer

The fix is sensible: copy out the relevant pointer (task_struct*) before the window. Instead of accessing an object whose memory might be reclaimed, just *remember* what you actually need.

Link: Fix Commit

Patch snippet

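// BEFORE (bad):
get_task_struct(waiter->task);
smp_store_release(&waiter->state, STATE_READY);
wake_q_add_safe(wake_q, waiter->task); // dereferences waiter after the release

// AFTER (good):
struct task_struct *task = get_task_struct(waiter->task); // copy while waiter is still valid
smp_store_release(&waiter->state, STATE_READY);
wake_q_add_safe(wake_q, task); // waiter is never touched after the release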

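Now, even if the original wait-queue struct is gone, the sender works from its own copy of the task pointer, and the reference taken by get_task_struct keeps that task alive, so nothing reads the expired stack slot.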
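The same ordering can be modeled in user space: take the reference and copy the pointer while the wait object is still guaranteed to exist, and only then publish the state change. Everything here is a hypothetical stand-in (`ref_task`, `ref_waiter`, `ref_get_task`, `ref_put_task`, `ref_fixed_send`), and a plain counter imitates the kernel's task reference count.

```c
#include <assert.h>
#include <string.h>

enum { REF_STATE_NONE = 0, REF_STATE_READY = 1 };

/* Hypothetical stand-ins for task_struct and the on-stack wait object. */
struct ref_task { int id; int refs; };
struct ref_waiter { int state; struct ref_task *task; };

/* Imitates get_task_struct(): bump the count and hand back the pointer. */
struct ref_task *ref_get_task(struct ref_task *t) { t->refs++; return t; }

/* Imitates dropping that reference once the wakeup has been delivered. */
void ref_put_task(struct ref_task *t) { t->refs--; }

/* Fixed order: copy first, publish second, never read *w afterwards. */
struct ref_task *ref_fixed_send(struct ref_waiter *w)
{
    assert(w != NULL && w->task != NULL);
    struct ref_task *task = ref_get_task(w->task); /* w is still valid here */
    w->state = REF_STATE_READY;                    /* receiver may leave now */
    memset(w, 0xAA, sizeof(*w));                   /* model: frame gets reused */
    return task;                                   /* local copy is unaffected */
}
```

Even though the model scribbles over the wait object right after the state change, the returned pointer still refers to the original task, and the extra reference is what keeps it from being freed underneath the sender.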

This pattern repeats in ipc/msg.c and ipc/sem.c, wherever stack-based wait queues and similar dereferencing happened.

Proof of Concept: Triggering the Bug

The exploit is *very* timing-sensitive, but you can approximate a crash with two threads hammering mq_timedsend and mq_timedreceive on the same queue. With some patience and luck, you might see a kernel panic:

// pseudocode for two-thread hammer:
// Thread A - Receiver:
while (1) {
    mq_timedreceive(mq, buf, sizeof(buf), NULL, &timeout);
}

// Thread B - Sender:
while (1) {
    mq_timedsend(mq, buf, sizeof(buf), 0, &timeout);
}

On an unpatched kernel, running this on multiple CPUs would occasionally crash with an oops mentioning wake_q_add_safe.
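One detail the loops above leave open is how `timeout` is built. `mq_timedsend` and `mq_timedreceive` take an absolute deadline measured against `CLOCK_REALTIME`, not a relative duration. A small helper along these lines could produce it; `timespec_add_ms` and `abs_timeout_in_ms` are hypothetical names, and the sketch assumes a non-negative millisecond count.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Add a non-negative number of milliseconds to a timespec, keeping
 * tv_nsec normalized to the range [0, 1e9). */
struct timespec timespec_add_ms(struct timespec base, long ms)
{
    assert(ms >= 0);
    base.tv_sec += ms / 1000;
    base.tv_nsec += (ms % 1000) * 1000000L;
    if (base.tv_nsec >= 1000000000L) {
        base.tv_sec += 1;
        base.tv_nsec -= 1000000000L;
    }
    return base;
}

/* Absolute deadline `ms` milliseconds from now, in the form the
 * mq_timed* calls expect. */
struct timespec abs_timeout_in_ms(long ms)
{
    struct timespec now;
    clock_gettime(CLOCK_REALTIME, &now);
    return timespec_add_ms(now, ms);
}
```

Each loop iteration would then refresh the deadline, for example `timeout = abs_timeout_in_ms(100);`, before passing `&timeout` to the call. A short timeout keeps the receiver cycling through the blocking path, which is where the race lives.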

- Many security boundaries in Linux rely on correct kernel memory isolation and robust queue handling.

- Even if only leading to crash/DoS, such bugs sometimes open up further avenues for exploitation (especially with kernel heap sprays, pointer reuse tricks, etc.).

References

- CVE-2021-47069 NVD Entry
- Linux Kernel Commit Fixing the Bug
- Full linux kernel mailing list report & patch

Conclusion

CVE-2021-47069 shows how even a tiny mistaken assumption in kernel code (using a pointer after its lifetime has expired) can turn into a crash, or worse. The Linux kernel team closed this subtle gap, but it's a reminder for anyone writing multi-threaded or concurrent code: don't trust stack pointers or any other resources once they're out of scope.

If you’re a distro maintainer or running a custom kernel, make sure your kernel is updated with this fix!

_If you found this read helpful, share and stay tuned for more kernel vulnerability breakdowns explained for real humans._

Timeline

Published on: 03/01/2024 22:15:46 UTC
Last modified on: 01/09/2025 18:21:01 UTC