Published: June 2021
Last Update: 2024-06-05


Linux kernel vulnerabilities matter to sysadmins, bug hunters, and security researchers alike. One such bug is CVE-2021-47034, which affects 64-bit PowerPC systems using the Radix MMU and concerns how page table updates for kernel memory are synchronized. Let’s break it down in simple terms: how it happens, how to trigger it, and, most importantly, how it got fixed.

TL;DR

When mapping kernel memory (vmalloc region) on PowerPC's Radix MMU, the kernel omitted a required synchronization instruction (ptesync) after writing the page table entry (PTE), so a following access could be translated before the new PTE was visible. The result was hard-to-debug kernel oopses and crashes during code patching (for example, when using ftrace).

The Problem, Explained

A quick look at Linux memory management: when memory is mapped, the kernel records the virtual-to-physical translation in a page table entry (PTE). For this to be safe and reliable, updates to these entries need to be synchronized so the CPU sees a valid mapping before it accesses that memory.

On PowerPC with Radix MMU, updating PTEs for kernel memory must be followed by a hardware sync (ptesync) to prevent spurious memory access faults. In other words, the PTE update has to be ordered before any access that relies on the new mapping.

However, an important kernel function, map_kernel_page(), missed this step. A related function, flush_cache_vmap(), contains the ptesync for vmalloc mappings, but map_kernel_page() does not call it. The consequence: if the kernel maps memory and uses it immediately (as it does when patching code), it can hit a hard-to-understand crash.

Technical Details: Where & How it Fails

The kernel function for updating PTEs is radix__set_pte_at(). For performance, it skips the ptesync that would make the new entry visible right away. For non-kernel memory this is harmless, because the page fault handler corrects any spurious fault that results. For kernel memory, no such fixup happens, so a spurious fault can crash the kernel or cause weird bugs.

A real-world demonstration

BUG: Unable to handle kernel data access on write at 0xc008000008f24a3c
Faulting instruction address: 0xc00000000008bd74
Oops: Kernel access of bad area, sig: 11 [#1]
...

This is a showstopper: nothing fixes up a spurious fault on kernel memory, so the kernel oopses instead of retrying the access.
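To check whether a machine has already logged this signature, one option is to search the kernel log for the lines shown above. The snippet below is a minimal sketch, not a diagnostic tool: the function name `find_radix_oops` is made up for this example, and a match only shows that some kernel data access fault was logged, not that this CVE caused it.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print any lines
# that match the oops signature quoted in this article.
find_radix_oops() {
    grep -E 'Unable to handle kernel data access|Kernel access of bad area'
}

# Example usage (reading the kernel ring buffer may require root):
#   dmesg | find_radix_oops
```

The function exits non-zero when nothing matches, so it can be used directly in an `if` condition.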

How Can You Trigger This Bug?

Chris Riedl from IBM found a reliable trigger, using ftrace (the kernel's dynamic function tracer) to cause a lot of code patching:

mount -t debugfs none /sys/kernel/debug
(while true; do echo function > /sys/kernel/debug/tracing/current_tracer ; echo nop > /sys/kernel/debug/tracing/current_tracer ; done) &

Under the hood, this maps and patches code over and over, quickly leading to a crash.
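For testing on a disposable machine, a bounded variant of the loop is easier to control than an endless background job. The sketch below is not the reporter's script: the function name `toggle_tracer` and its two arguments are invented for this example. It writes `function` and `nop` to the file it is given a fixed number of times, then restores the value that was there before.

```shell
#!/bin/sh
# Hypothetical helper: toggle a tracer control file a fixed number of times.
#   $1 = path to current_tracer
#        (e.g. /sys/kernel/debug/tracing/current_tracer)
#   $2 = number of function/nop cycles to run
toggle_tracer() {
    tracer_file=$1
    cycles=$2
    # Remember the current setting so it can be restored afterwards.
    saved=$(cat "$tracer_file") || return 1
    i=0
    while [ "$i" -lt "$cycles" ]; do
        echo function > "$tracer_file" || return 1
        echo nop > "$tracer_file" || return 1
        i=$((i + 1))
    done
    echo "$saved" > "$tracer_file"
}

# Example usage (as root, on a test machine only):
#   toggle_tracer /sys/kernel/debug/tracing/current_tracer 1000
```

On a vulnerable kernel the machine may still crash partway through, in which case the restore step never runs.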

Full story in: lore.kernel.org report

Here's the problematic kernel segment (simplified)

static inline void radix__set_pte_at(..., pte_t pte)
{
    WRITE_ONCE(*ptep, pte);
    // MISSING: ptesync here
}

The fix adds a ptesync after the PTE store. In its simplest form (on PowerPC the instruction is usually issued via inline assembly):

static inline void radix__set_pte_at(..., pte_t pte)
{
    WRITE_ONCE(*ptep, pte);
    asm volatile("ptesync" ::: "memory");
}

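The kernel maintainers did not want to pay this cost on every PTE update, though. The sync is issued only when mapping kernel pages, NOT user pages, which keep relying on the page fault handler to fix up spurious faults.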

Exploiting the Bug

This is not a classic privilege escalation or remote code execution. It is a denial-of-service (crash) bug: an attacker who already has sufficient access, or a buggy module, could exercise the features that trigger it and crash the server.

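The trigger is available to any process that can make the kernel rapidly map and patch its own code. With ftrace, that means write access to the tracing control files, which normally requires root.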

PoC trigger:

# Mount debugfs
mount -t debugfs none /sys/kernel/debug
# Run a loop to rapidly toggle ftrace tracer
(while true; do echo function > /sys/kernel/debug/tracing/current_tracer ; echo nop > /sys/kernel/debug/tracing/current_tracer ; done) &

On vulnerable kernels (PowerPC, Radix MMU, 64K page size, probably pre-5.10), this reliably triggers the crash above within minutes.

The Kernel Fix

The Linux kernel maintainers fixed this by making sure a ptesync is enforced for vmalloc mappings. See the official patch series:

- kernel.org patch
- lore.kernel.org discussion

Summary of the patch:
Ensure a hardware sync (ptesync) is always performed for kernel memory mappings, fixing spurious faults during code patching and other sensitive operations.

Takeaways: Protecting Yourself

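- If you run Linux on PowerPC64 with Radix: Upgrade to a kernel that includes the fix, or apply your distro’s security update for CVE-2021-47034. Check the NVD entry or your distro’s advisory for the exact fixed versions instead of relying on a major version number alone.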
- If you do live kernel patching or heavy tracing: Double-check your kernel version. As a workaround, avoid rapidly toggling ftrace tracers until you are patched.
- If you’re a security researcher: This is a nice example of a “missing memory barrier/sync = crash” bug, the kind that shows up on architectures with weak ordering guarantees such as PowerPC.
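As a first step for the checklist above, it helps to know whether a machine is in the affected class at all. The following is a rough sketch under stated assumptions: the function name `is_ppc64_radix` is hypothetical, and it assumes the kernel reports the MMU mode in /proc/cpuinfo as a line of the form `MMU : Radix`. It says nothing about whether the running kernel already carries the fix.

```shell
#!/bin/sh
# Hypothetical helper: decide from a machine string and cpuinfo-style text
# whether this is a 64-bit PowerPC system running the Radix MMU.
#   $1 = machine string (as printed by `uname -m`)
#   $2 = path to a cpuinfo-style file
is_ppc64_radix() {
    case $1 in
        ppc64|ppc64le) ;;
        *) return 1 ;;
    esac
    # Assumption: powerpc kernels report the MMU mode as "MMU : Radix".
    grep -Eq '^MMU[[:space:]]*:[[:space:]]*Radix' "$2"
}

# Example usage:
#   if is_ppc64_radix "$(uname -m)" /proc/cpuinfo; then
#       echo "ppc64 with Radix MMU: check that your kernel carries the fix"
#   fi
```

Taking the machine string and the file path as arguments keeps the check easy to run against saved output from another host.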

Further Reading

- CVE page: https://nvd.nist.gov/vuln/detail/CVE-2021-47034
- Original kernel bug report (lore)
- The kernel fix commit

Conclusion

CVE-2021-47034 is a classic architecture-specific kernel bug: a missing low-level sync that only shows up under the right combination of conditions and hardware. If you run Linux/PowerPC with Radix MMU, patch ASAP for stability. If you hunt bugs, this is a textbook example of why CPU architectures and memory models matter.

Stay safe, stay patched!

(Co-written & simplified for clarity by a Linux security enthusiast; please see reference links for full details.)

Timeline

Published on: 02/28/2024 09:15:39 UTC
Last modified on: 10/31/2024 15:35:02 UTC