A vulnerability in the Linux kernel's vmscan memory management code, tracked as CVE-2024-57884, was recently identified and resolved. This article describes the issue, shows the call stack from the crash report, explains the fix, and links to the original references.

The Vulnerability: Infinite Loop in throttle_direct_reclaim()

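The hang occurs when a task loops indefinitely in throttle_direct_reclaim() because allow_direct_reclaim(pgdat) keeps returning false. While summing free pages and watermark reserves for the node, allow_direct_reclaim() skips any zone for which zone_reclaimable_pages() returns 0. A zone such as ZONE_DMA32 can have no reclaimable file-backed or anonymous pages yet plenty of free pages, so it is skipped and only the zone under pressure, such as ZONE_NORMAL, is counted. At the same time, the node (pgdat) is regarded as balanced because ZONE_DMA32 has enough free pages to meet its watermarks, which masks the pressure on ZONE_NORMAL. Nothing changes the outcome of the check, so the task stays stuck in throttle_direct_reclaim() and the kernel hangs.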
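The check described above can be modeled in a few lines of userspace C. This is a minimal sketch, not the kernel source: struct zone_model, reclaimable_pages_old() and allow_direct_reclaim_old() are hypothetical names invented for illustration, the page counts are made up, and the real function handles further cases (such as waking kswapd) that are left out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_model {
    const char *name;
    unsigned long file_pages; /* active + inactive file-backed pages */
    unsigned long anon_pages; /* active + inactive anonymous pages */
    unsigned long free_pages; /* free pages in the zone */
    unsigned long min_wmark;  /* min watermark, in pages */
};

/* Pre-patch model: only file-backed and anonymous pages are counted. */
unsigned long reclaimable_pages_old(const struct zone_model *z)
{
    return z->file_pages + z->anon_pages;
}

/*
 * Simplified model of the throttling check: sum the min watermarks and
 * free pages of every zone that reports reclaimable pages, then allow
 * direct reclaim only if the free pages exceed half of the reserve.
 */
bool allow_direct_reclaim_old(const struct zone_model *zones, size_t n)
{
    unsigned long reserve = 0;
    unsigned long free_pages = 0;

    for (size_t i = 0; i < n; i++) {
        if (reclaimable_pages_old(&zones[i]) == 0)
            continue; /* zone skipped: its free pages are ignored */
        reserve += zones[i].min_wmark;
        free_pages += zones[i].free_pages;
    }

    if (reserve == 0)
        return true; /* no reserves counted: do not throttle */

    return free_pages > reserve / 2;
}
```

With a hypothetical DMA32 zone holding 5000 free pages but no file-backed or anonymous pages, and a Normal zone with 50 file-backed pages and 20 free pages against a min watermark of 1000, allow_direct_reclaim_old() returns false: only the Normal zone is counted, and 20 is not greater than 500. As long as those numbers do not change, every retry gives the same answer.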

To illustrate the issue, consider the following call stack taken from the actual crash report:

 #0 [ffff80002cb6f8d0] __switch_to at ffff8000080095ac
 #1 [ffff80002cb6f900] __schedule at ffff800008abbd1c
 #2 [ffff80002cb6f990] schedule at ffff800008abc50c
 #3 [ffff80002cb6f9b0] throttle_direct_reclaim at ffff800008273550
 #4 [ffff80002cb6fa20] try_to_free_pages at ffff800008277b68
 #5 [ffff80002cb6fae0] __alloc_pages_nodemask at ffff8000082c4660
 #6 [ffff80002cb6fc50] alloc_pages_vma at ffff8000082e4a98
 #7 [ffff80002cb6fca0] do_anonymous_page at ffff80000829f5a8
 #8 [ffff80002cb6fce0] __handle_mm_fault at ffff8000082a5974
 #9 [ffff80002cb6fd90] handle_mm_fault at ffff8000082a5bd4

The Patch: Account for Free Pages in zone_reclaimable_pages()

The fix changes zone_reclaimable_pages() so that it counts a zone's free pages (NR_FREE_PAGES) when the zone has no other reclaimable pages, such as file-backed or anonymous pages. As a result, zones like ZONE_DMA32, which have sufficient free pages, are no longer mistakenly regarded as unreclaimable. The patch thereby ensures proper node balancing, avoids masking pressure on other zones such as ZONE_NORMAL, and prevents the infinite loop in throttle_direct_reclaim() caused by allow_direct_reclaim(pgdat) repeatedly returning false.
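The effect of the fix can be sketched as a small userspace model in C. As before, this is an illustration under stated assumptions rather than the kernel code: struct zone_model, reclaimable_pages_new() and allow_direct_reclaim_new() are hypothetical names, the numbers are invented, and the model ignores the kernel's rule that anonymous pages count only when they can actually be reclaimed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_model {
    const char *name;
    unsigned long file_pages; /* active + inactive file-backed pages */
    unsigned long anon_pages; /* active + inactive anonymous pages */
    unsigned long free_pages; /* free pages in the zone */
    unsigned long min_wmark;  /* min watermark, in pages */
};

/*
 * Post-patch model: if the zone has no file-backed or anonymous pages,
 * fall back to its free pages so that it is not treated as unreclaimable.
 */
unsigned long reclaimable_pages_new(const struct zone_model *z)
{
    unsigned long nr = z->file_pages + z->anon_pages;

    if (nr == 0)
        nr = z->free_pages;

    return nr;
}

/* Same simplified throttling check, now using the patched helper. */
bool allow_direct_reclaim_new(const struct zone_model *zones, size_t n)
{
    unsigned long reserve = 0;
    unsigned long free_pages = 0;

    for (size_t i = 0; i < n; i++) {
        if (reclaimable_pages_new(&zones[i]) == 0)
            continue; /* skipped only if the zone has no free pages either */
        reserve += zones[i].min_wmark;
        free_pages += zones[i].free_pages;
    }

    if (reserve == 0)
        return true; /* no reserves counted: do not throttle */

    return free_pages > reserve / 2;
}
```

For a hypothetical DMA32 zone with 5000 free pages and a min watermark of 100, next to a Normal zone with 50 file-backed pages, 20 free pages and a min watermark of 1000, both zones are now counted. The reserve is 1100 and the free total is 5020, so allow_direct_reclaim_new() returns true and the task is not throttled. A zone with neither reclaimable nor free pages is still skipped.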

Original References

The vulnerability and its resolution were discussed on the Linux kernel mailing list. The commit and the discussion are available at the following links:

1. Commit Report: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fef30591a9272dbbe6fccf8dbe728178b152ef8d
2. Mailing List Discussion: https://lkml.org/lkml/2022/1/13/880

In conclusion, CVE-2024-57884 involved a task getting stuck in an infinite loop in throttle_direct_reclaim() because a zone with free pages but no other reclaimable pages was treated as unreclaimable while the node as a whole was regarded as balanced. The problem was resolved by counting free pages in zone_reclaimable_pages() when a zone has no other reclaimable pages, which removes this source of kernel hangs.

Timeline

Published on: 01/15/2025 13:15:12 UTC
Last modified on: 01/20/2025 06:28:51 UTC