CVE-2023-52489 - Race Condition in Linux Kernel’s mm/sparsemem – Exploit Analysis & Fix

A serious race condition vulnerability, now assigned CVE-2023-52489, was discovered and patched in the Linux kernel's memory management (mm/sparsemem). This bug could lead to kernel crashes on systems using specific memory configurations, especially on devices like Qualcomm Snapdragon-based SoCs in configurations mixing device and system memory.

This long read explains what went wrong, how it was discovered, what the impact is, and how the vulnerability was fixed, with relevant code snippets and reference links along the way. We aim for plain language, with an emphasis on technical clarity for both security professionals and curious developers.

1. Background and Where It Happened

In Linux, memory is divided into sections and zones. On some ARM devices and newer SoCs, memory layouts can look like:

[ZONE_NORMAL] [ZONE_DEVICE] [ZONE_NORMAL]

Here, device memory might be placed in between "normal" memory zones. With CONFIG_SPARSEMEM_VMEMMAP=y (sparse memory with a virtual memory map), the kernel tracks the state of each memory section through memory_section->usage.
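To make the section bookkeeping concrete, here is a minimal user-space sketch of how a page frame number (PFN) maps to a section and to one bit of that section's subsection bitmap. The macro and helper names mirror the kernel's, but this is a standalone model, and the numeric values (4 KiB pages, 128 MiB sections, 2 MiB subsections) are assumptions chosen for illustration; real values depend on the architecture and configuration.

```c
#include <assert.h>

/* Assumed layout for illustration only; real values are per-architecture. */
#define PAGE_SHIFT              12  /* 4 KiB pages */
#define SECTION_SIZE_BITS       27  /* 128 MiB sections */
#define SUBSECTION_SHIFT        21  /* 2 MiB subsections */

#define PFN_SECTION_SHIFT       (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION       (1UL << PFN_SECTION_SHIFT)
#define PFN_SUBSECTION_SHIFT    (SUBSECTION_SHIFT - PAGE_SHIFT)
#define PAGES_PER_SUBSECTION    (1UL << PFN_SUBSECTION_SHIFT)
#define SUBSECTIONS_PER_SECTION (PAGES_PER_SECTION / PAGES_PER_SUBSECTION)

/* With the assumed values, one section is tracked by a 64-bit bitmap. */
static_assert(SUBSECTIONS_PER_SECTION == 64, "unexpected subsection count");

/* Which memory section does this PFN fall into? */
static unsigned long pfn_to_section_nr(unsigned long pfn)
{
    return pfn >> PFN_SECTION_SHIFT;
}

/* Which bit of the section's subsection bitmap covers this PFN? */
static unsigned long subsection_map_index(unsigned long pfn)
{
    return (pfn & (PAGES_PER_SECTION - 1)) / PAGES_PER_SUBSECTION;
}
```

Under these assumptions, PFN 0x12345 lands in section 2, subsection bit 17. The bitmap that holds that bit lives behind the ->usage pointer, which is why a reader that validates a PFN at subsection granularity has to dereference it.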

Problem:
If one CPU is compacting (defragmenting) memory while another is hot-unplugging (removing) device memory, a race can occur:

Thread 1: During compaction, checks that a PFN is valid, then goes on to read the section's ->usage

Thread 2: Starts removing a memory section, frees its ->usage, and sets it to NULL

If Thread 2's free lands between Thread 1's check and its read, Thread 1 dereferences a NULL pointer.

From the kernel logs:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
pc : __pageblock_pfn_to_page+0x6c/0x14c
lr : compact_zone+0x994/0x1058

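The faulting function, __pageblock_pfn_to_page, was called from compact_zone. That is the compaction path, and the NULL address points right at the memory section's usage pointer.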

Illustrated (Pseudocode):

// Thread 1: Compacting memory
if (pfn_valid(pfn)) {                  // Returns true
    if (pfn_section_valid(pfn)) {      // Uses ms->usage, could be NULL
        // Access ms->usage here!
    }
}

// Thread 2: Hot-unplug device memory
sparse_remove_section() {
    section_deactivate() {
        kfree(ms->usage);              // Frees usage array
        ms->usage = NULL;              // Sets pointer to NULL
    }
}
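The interleaving above can be replayed deterministically in user space. The following is a minimal sketch, not kernel code: section_model, usage_model, and every function name are hypothetical stand-ins, and the model reports a NULL_DEREF outcome where the real kernel would simply crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct usage_model { unsigned long subsection_map; };

struct section_model {
    bool has_mem_map;           /* models the SECTION_HAS_MEM_MAP flag */
    struct usage_model *usage;  /* models ms->usage */
};

enum outcome { READ_OK, SECTION_INVALID, NULL_DEREF };

/* Reader step 1: the coarse "is this section valid?" check. */
static bool reader_check(const struct section_model *ms)
{
    return ms->has_mem_map;
}

/* Reader step 2: touch ms->usage. The kernel would oops on NULL here. */
static enum outcome reader_use(const struct section_model *ms)
{
    return ms->usage == NULL ? NULL_DEREF : READ_OK;
}

/* Pre-fix teardown order: free and NULL the pointer, clear the flag last. */
static void unplug_old_order(struct section_model *ms)
{
    free(ms->usage);
    ms->usage = NULL;
    ms->has_mem_map = false;
}

/* Replay the bad interleaving: check, then unplug, then use. */
static enum outcome replay_race(void)
{
    struct section_model ms = { true, calloc(1, sizeof(struct usage_model)) };

    assert(ms.usage != NULL);
    if (!reader_check(&ms))     /* Thread 1: the check passes */
        return SECTION_INVALID;
    unplug_old_order(&ms);      /* Thread 2 runs between check and use */
    return reader_use(&ms);     /* Thread 1: the pointer is now NULL */
}
```

Calling replay_race() returns NULL_DEREF: the reader's check passed, yet the pointer it went on to use had already been torn down.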

Original Patch Discussion & Crash Logs:

https://lore.kernel.org/linux-mm/994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com/

Fix in Upstream Kernel:

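Commit details on git.kernel.org: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=95fec71adbbe1fe8b1fcd866bd9c8c9b687b7b76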

This bug is a race condition, not a direct privilege escalation. Here’s why it matters:

- If triggered (even accidentally, for example during memory hotplug or device driver operations), it causes a kernel crash.
- Repeated crashes could be used for denial of service (DoS).
- Theoretically, a skilled attacker could try to arrange memory timings to cause further undefined behavior or gain a primitive to escalate (e.g., if callbacks were misused), but that’s not proven as of June 2024.
- The primary risk is reliability and stability, especially in device farms, servers with hot-plugging memory, and ARM SoCs with complex memory maps.

a) Mark the Section Invalid First

Before freeing the pointer, clear the flag that marks the section as valid:

ms->section_mem_map &= ~SECTION_HAS_MEM_MAP;

b) Use RCU (Read-Copy Update) for Protection

The pointer is freed using RCU (kfree_rcu()), ensuring other CPUs finish their reads before memory is released:

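kfree_rcu(ms->usage, rcu);       // 'rcu' is the rcu_head member embedded in the usage struct
WRITE_ONCE(ms->usage, NULL);     // new readers see NULL; the old memory stays valid until current readers finish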

c) Ordering Guarantees

- Once the section flag is cleared, further validity checks (pfn_valid()) return false, so new readers never try to dereference ->usage. Readers that had already passed the check are covered by the RCU-deferred free.
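The three steps above can be sketched in user space as follows. This is a toy model under loud assumptions: grace_model is a hypothetical reader counter that stands in for an RCU grace period and is not how RCU is implemented, and all struct and function names are invented for illustration. The comments name the kernel primitive each step is meant to mimic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct usage_model { unsigned long subsection_map; };

struct section_model {
    bool has_mem_map;           /* models the SECTION_HAS_MEM_MAP flag */
    struct usage_model *usage;  /* models ms->usage */
};

/* Toy grace-period tracker: NOT real RCU, just enough to show the ordering. */
struct grace_model {
    int active_readers;
    struct usage_model *pending_free;
};

/* Release the queued object once no reader can still hold a reference. */
static void flush_if_quiescent(struct grace_model *g)
{
    if (g->active_readers == 0 && g->pending_free != NULL) {
        free(g->pending_free);
        g->pending_free = NULL;
    }
}

static void reader_enter(struct grace_model *g) { g->active_readers++; }

static void reader_exit(struct grace_model *g)
{
    g->active_readers--;
    flush_if_quiescent(g);
}

/* Post-fix teardown order: invalidate, queue the free, then NULL the pointer. */
static void unplug_new_order(struct section_model *ms, struct grace_model *g)
{
    ms->has_mem_map = false;        /* flag cleared first */
    g->pending_free = ms->usage;    /* models kfree_rcu() */
    ms->usage = NULL;               /* models WRITE_ONCE(ms->usage, NULL) */
    flush_if_quiescent(g);
}

/* True if an in-flight reader finishes safely and a later reader is rejected. */
static bool replay_fixed(void)
{
    struct section_model ms = { true, calloc(1, sizeof(struct usage_model)) };
    struct grace_model g = { 0, NULL };

    assert(ms.usage != NULL);
    ms.usage->subsection_map = 0x1UL;

    reader_enter(&g);
    struct usage_model *snap = ms.usage;    /* models READ_ONCE(ms->usage) */
    unplug_new_order(&ms, &g);              /* unplug runs mid-read */
    bool old_reader_ok = (snap->subsection_map == 0x1UL);
    bool free_deferred = (g.pending_free == snap);
    reader_exit(&g);                        /* last reader leaves, memory is freed */

    bool new_reader_rejected = !ms.has_mem_map;
    return old_reader_ok && free_deferred && new_reader_rejected &&
           g.pending_free == NULL;
}
```

replay_fixed() returns true: the reader that took its snapshot before the unplug still reads valid memory, the free is held back until that reader leaves, and any reader arriving afterwards fails the validity check before it ever looks at ->usage.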

6. Snippet from the Actual Fix

You can find the fix here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=95fec71adbbe1fe8b1fcd866bd9c8c9b687b7b76

Pseudo-diff overview:

// Old code
kfree(ms->usage);
ms->usage = NULL;
ms->section_mem_map &= ~SECTION_HAS_MEM_MAP;

// New code
ms->section_mem_map &= ~SECTION_HAS_MEM_MAP; // Flag cleared *first*
kfree_rcu(ms->usage, rcu);                   // Free deferred with RCU for safety
WRITE_ONCE(ms->usage, NULL);                 // New readers see NULL

7. How to Stay Safe (Mitigation/Detection)

- Update Your Kernel: If you rely on hotplug memory, device memory, or complex memory zoning, patch your kernel ASAP.
- Watch Dmesg Logs: Crashes with __pageblock_pfn_to_page in the backtrace, or the message "Unable to handle kernel NULL pointer dereference", may indicate this bug.
- Hardening: Consider enabling RCU debugging and memory hardening options, such as CONFIG_PROVE_RCU and KASAN, on test builds.

- Cloud/SOC Vendors: Deploy automated testing for memory hot-plug scenarios.
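For the log-watching step, a simple signature match is often enough for a first pass. The sketch below is a hypothetical helper, not an existing tool; the two signature strings are taken from the crash log quoted earlier in this article, and a hit is only a hint, since many unrelated bugs also produce NULL pointer dereferences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if a single log line contains one of the crash signatures. */
static bool line_matches_signature(const char *line)
{
    return strstr(line, "__pageblock_pfn_to_page") != NULL ||
           strstr(line, "NULL pointer dereference") != NULL;
}

/* Count the suspicious lines in an array of n log lines. */
static size_t count_suspicious(const char *const *lines, size_t n)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (line_matches_signature(lines[i]))
            hits++;
    }
    return hits;
}
```

In practice the lines would come from dmesg output or a collected crash log. The function name in the backtrace is the more specific of the two signatures, so it is worth weighting that match more heavily when triaging.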

8. Summary Table

| CVE | Component | Type | Impact | Fixed Kernel |
|------------------|-------------------|-------------------|--------------------|------------------|
| CVE-2023-52489 | Linux mm/sparsemem| Race Condition | Kernel Crash (DoS) | v6.9, 6.1.77 LTS |

9. Conclusion

CVE-2023-52489 demonstrates how subtle and timing-related memory management bugs can have big consequences in the kernel. The fix teaches us the importance of proper data access ordering and safe memory freeing with synchronization (RCU).

If you manage systems with device memory hotplug or are running kernels on ARM SoCs, especially Snapdragon, upgrade immediately.

References

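- Upstream Discussion: https://lore.kernel.org/linux-mm/994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com/
- Linux commit 95fec71adbbe1fe8b1fcd866bd9c8c9b687b7b76: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=95fec71adbbe1fe8b1fcd866bd9c8c9b687b7b76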
- CVE Record *(pending full update as of publishing)*

Timeline

Published on: 03/11/2024 18:15:16 UTC
Last modified on: 02/14/2025 16:41:06 UTC