Date: June 2024
Author: [YourName]


The Linux kernel’s KVM (Kernel-based Virtual Machine) lets you run virtual machines efficiently. But, like all complex software, sometimes things slip through. CVE-2024-26991 is one of those “sneaky” bugs: it makes the kernel read past the end of one of its own buffers, which can cause instability, possible information leaks, or just unexplained crashes.

Below, I’ll break down this bug in plain language, show how it happens (with code snippets!), explain how it was fixed, and give resources to read further. If you’re managing Linux hosts running VMs or hacking the kernel, read on!

What is CVE-2024-26991?

Summary:
A bug was found in KVM’s handling of memory attributes for x86 VMs, specifically in how it tracks and marks which guest frame numbers (GFNs) support large (“huge”) pages. When handling the KVM_SET_MEMORY_ATTRIBUTES ioctl, KVM could read out of bounds from an internal vmalloc-allocated array (lpage_info). The stray read triggers Kernel AddressSanitizer (KASAN) warnings and could, in the worst case, destabilize the VM or leak information.

Why Does This Happen?

Short version: an internal optimization assumes the tracking arrays hold more entries than they really do. When applying memory attributes, KVM consults the next-smaller hugepage level (“level - 1”) to decide whether a larger hugepage at “level” is still usable, without accounting for the fact that the smaller level’s array does not cover every guest frame number (GFN) in the larger page’s range. With odd-sized or misaligned memory regions (“memslots”), KVM ends up reading just past the end of the array: a heap (vmalloc) buffer over-read.

The Internals (With Code Snippets!)

KVM uses a data structure kvm_lpage_info to record *which* parts of guest memory can use which hugepage sizes. Each memslot has multiple lpage_info arrays—one per supported hugepage size. For each GFN, one kvm_lpage_info entry tracks a special “mixed-attributes” status bit (KVM_LPAGE_MIXED_FLAG) and a count.

All this feeds into the logic for enabling/disabling hugepages across various boundaries (most importantly: don’t let a hugepage cross a memslot boundary, or contain areas with inconsistent attributes).

The Crux

- When marking a region as “mixed” (so no hugepage allowed), KVM has to loop over the lpage_info arrays for the relevant pages.
- As an optimization, if the underlying, smaller hugepages (at “level - 1”) are all consistent, assume the larger page (at “level”) is also good.
- But: Not all possible hugepages are tracked in these arrays! Head/tail pages at the slot boundary may be missing.

Here’s the essence of what goes wrong (simplified):

// 'level' represents hugepage size (e.g., 2MB, 1GB)
// 'gfn' is the guest frame number, 'memslot' describes guest memory region

struct kvm_lpage_info *linfo = memslot->arch.lpage_info[level];
if (hugepage_has_attrs(memslot, level - 1, gfn, attr)) {
    // optimization: skip slow path if lower level pages are consistent
}

Inside hugepage_has_attrs, the loop looks roughly like this:

struct kvm_lpage_info *lpinfo = lpage_info[level - 1];

for (i = 0; i < npages; i++) {
    if (lpinfo[start_gfn + i].flags & KVM_LPAGE_MIXED_FLAG) {
        return false;
    }
}

But if start_gfn + i lands beyond the end of the array, the read is out of bounds!

With KASAN enabled, you get a splat like:

BUG: KASAN: vmalloc-out-of-bounds in hugepage_has_attrs+0x7e/0x110
Read of size 4 at addr ffffc900000a3008 by task private_mem_con/169
Call Trace:
  ...
  hugepage_has_attrs
  kvm_arch_post_set_memory_attributes
  kvm_vm_ioctl

Exploitability

- Local VM Managers: A malicious or buggy process with access to /dev/kvm could trigger the out-of-bounds read through KVM ioctls. KVM access is usually restricted, but on custom deployments it can be attacker-controlled.
- System Security: While this mainly causes faults and KASAN splats, out-of-bounds reads theoretically could expose adjacent kernel memory if the wrong circumstances occur.

- Denial of Service: At minimum, this can crash the host kernel if triggered repeatedly.

There is no public exploit code for this: demonstrating it takes a custom userspace driving KVM ioctls plus a KASAN-enabled kernel to observe the over-read. But the bug is real and reproducible.

The Fix (Patched Code)

The upstream Linux kernel has been patched. The main change: always check bounds before reading array entries, and avoid the wrong assumptions about boundary entries.

Patch Excerpt

if (index_within_bounds(...)) {
    // safe to access
    if (lpinfo[index].flags & KVM_LPAGE_MIXED_FLAG) {
        return false;
    }
}

Check out the official fix:
- Upstream commit 18502bb8d7f ("KVM: x86/mmu: x86: Don't overflow lpage_info when checking attributes")
- Kernel.org CVE entry

References

- Linux kernel commit message/fix
- KVM kernel code: arch/x86/kvm/mmu/mmu.c
- KVM selftests: private_mem_conversions_test
- Linux KASAN documentation
- Debian Security CVE page for CVE-2024-26991

What Should You Do?

- Upgrade your kernel! Patches have landed in all supported upstream series. If you use KVM for nested virtualization or cloud workloads, patch as soon as possible.
- Monitor your KVM management code: Avoid complex KVM_SET_MEMORY_ATTRIBUTES patterns on unaligned, small memslots where possible.
- For kernel devs: Validate array accesses, especially with multi-level page tables and complex optimizations.

If you run untrusted workloads or provide KVM as-a-service, this bug especially concerns you.

In Closing

CVE-2024-26991 is a great example of how *optimizations* in low-level code can backfire when assumptions about array boundaries or memory regions are wrong. While not a classic remote code execution bug, this sort of vulnerability is worth fixing fast. As always, keep up with kernel updates and review security bulletins!

Feel free to share or link to this post. Stay safe, and happy hacking!

Timeline

Published on: 05/01/2024 06:15:16 UTC
Last modified on: 11/07/2024 18:35:07 UTC