CVE-2024-26598 - Use-After-Free (UAF) in Linux KVM ARM64 `vgic-its` LPI Translation Cache

On certain ARM64 systems running Linux with KVM virtualization, a vulnerability (tracked as CVE-2024-26598) was found and fixed in the way the vgic-its component of the kernel handled LPI (Locality-specific Peripheral Interrupt) translation caching. If left unpatched, this vulnerability could allow a local attacker (possibly a hostile VM guest) to trigger a use-after-free (UAF) bug, potentially escalating privileges or causing a denial of service.

Where’s the Problem?

The issue lies in the ITS (Interrupt Translation Service) emulation in KVM's ARM64 implementation. Specifically, when an interrupt translation request hits the LPI translation cache, it can race with another thread that invalidates that cache (for example, while handling a DISCARD ITS command).

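The function vgic_its_check_cache() returned a cached object (vgic_irq) without taking a reference on it before dropping the lock that serializes refcount changes. Another thread could therefore free the object as soon as the lock was released, leaving the caller with a dangling pointer and a possible use-after-free.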

Before the fix, the code looked like this (simplified for clarity):

struct vgic_irq *vgic_its_check_cache(...) {
    // ... (lock acquired)
    struct vgic_irq *irq = find_in_cache();
    // ... (lock released)
    return irq;
}

// Later usage
irq = vgic_its_check_cache(...);
// <-- irq might be freed at this point!
handle_irq(irq);

If another operation (like a DISCARD command) runs just after the lock is dropped but before handle_irq() uses irq, the memory belonging to irq might already be freed: a classic UAF.
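To make the interleaving concrete, here is a minimal user-space sketch that replays it deterministically. It is a toy model, not kernel code: the VgicIrq and TranslationCache classes, the `freed` flag, and the ("dev0", 7) key are all hypothetical stand-ins invented for illustration.

```python
import threading


class VgicIrq:
    """Toy stand-in for the kernel's struct vgic_irq (hypothetical model)."""

    def __init__(self, intid):
        self.intid = intid
        self.refcount = 1   # the single reference held by the translation cache
        self.freed = False  # stands in for "memory returned to the allocator"


class TranslationCache:
    def __init__(self):
        self.lock = threading.Lock()
        self.entries = {}

    def insert(self, key, irq):
        with self.lock:
            self.entries[key] = irq

    def check_cache_unsafe(self, key):
        # Pre-fix pattern: look up under the lock, then return the object
        # without taking a reference of our own.
        with self.lock:
            return self.entries.get(key)

    def discard(self, key):
        # Invalidation: drop the cache's reference and "free" the object
        # once the count reaches zero.
        with self.lock:
            irq = self.entries.pop(key, None)
            if irq is not None:
                irq.refcount -= 1
                if irq.refcount == 0:
                    irq.freed = True


def replay_buggy_interleaving():
    cache = TranslationCache()
    key = ("dev0", 7)                   # hypothetical (device, event) pair
    cache.insert(key, VgicIrq(intid=8192))
    irq = cache.check_cache_unsafe(key)  # thread A: cache hit, lock dropped
    cache.discard(key)                   # thread B: DISCARD lands in the gap
    return irq.freed                     # thread A is about to use freed memory


stale = replay_buggy_interleaving()
print("object freed before use:", stale)
```

In Python the object stays alive because the interpreter manages memory; the `freed` flag only marks the point at which the kernel would already have released it.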

The Patch: How Was It Fixed?

The kernel fix takes a reference on the object before releasing the lock. vgic_its_check_cache() now increments the reference counter on the cached interrupt object (vgic_irq) while the lock is still held, and the caller drops that reference once the interrupt has been queued. In the upstream patch these two steps are done with vgic_get_irq_kref() and vgic_put_irq().

Simplified pseudocode after the fix:

struct vgic_irq *vgic_its_check_cache(...) {
    // ... (lock acquired)
    struct vgic_irq *irq = find_in_cache();
    if (irq)
        refcount_inc(&irq->refcount); // <-- bump refcount!
    // ... (lock released)
    return irq;
}

// Later usage
irq = vgic_its_check_cache(...);
if (irq) {
    // irq cannot be freed while we hold our own reference
    handle_irq(irq);
    // When done, drop the reference taken by the lookup
    refcount_dec(&irq->refcount); // <-- release
}

Now, even if another thread tries to free the object, it won’t disappear until all users are done, preventing UAF.
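The same get/put discipline can be sketched in user space. The model below is an illustration under stated assumptions rather than the kernel implementation: the VgicIrq class, the module-level `_cache` and `_lock`, and the helper names check_cache(), put_irq() and discard() are hypothetical.

```python
import threading


class VgicIrq:
    """Toy stand-in for struct vgic_irq (hypothetical model)."""

    def __init__(self, intid):
        self.intid = intid
        self.refcount = 1   # reference owned by the translation cache
        self.freed = False


_lock = threading.Lock()    # serializes cache access and refcount changes
_cache = {}


def _put_locked(irq):
    # Drop one reference; "free" the object when the last one goes away.
    irq.refcount -= 1
    if irq.refcount == 0:
        irq.freed = True


def check_cache(key):
    # Post-fix pattern: take our own reference before the lock is released.
    with _lock:
        irq = _cache.get(key)
        if irq is not None:
            irq.refcount += 1
        return irq


def put_irq(irq):
    with _lock:
        _put_locked(irq)


def discard(key):
    # Invalidation only drops the cache's reference.
    with _lock:
        irq = _cache.pop(key, None)
        if irq is not None:
            _put_locked(irq)


def replay_fixed_interleaving():
    key = ("dev0", 7)                 # hypothetical (device, event) pair
    _cache[key] = VgicIrq(intid=8192)
    irq = check_cache(key)            # thread A: cache hit, refcount is now 2
    discard(key)                      # thread B: DISCARD, refcount back to 1
    alive_during_use = not irq.freed  # thread A can still use the object
    put_irq(irq)                      # thread A: done, last reference dropped
    return alive_during_use, irq.freed


alive_during_use, freed_after_put = replay_fixed_interleaving()
print("alive while in use:", alive_during_use)
print("freed after final put:", freed_after_put)
```

The point of the design is that the invalidation path and the lookup path each own exactly one reference, so whichever finishes last is the one that frees the object.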

On an unpatched kernel, with careful timing, memory reuse, and possibly heap-spraying techniques, this race could cause:

- Kernel panic (crash/DoS)
- Privilege escalation (code execution in kernel context), depending on system config and other factors

A rough idea of how this might look in a malicious guest/VM (pseudocode):

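from threading import Thread

def send_lpi_interrupt():
    # Hypothetical placeholder: trigger an LPI whose translation is cached
    pass

def send_discard_command():
    # Hypothetical placeholder: issue an ITS DISCARD for the same mapping
    pass

def send_lpi_hits():
    # Loop: repeatedly trigger LPI interrupts (cache hits)
    for _ in range(10000):
        send_lpi_interrupt()

def invalidate_cache():
    # Loop: repeatedly send DISCARD commands (cache invalidations)
    for _ in range(10000):
        send_discard_command()

# Run both "racing" threads and wait for them to finish
threads = [Thread(target=send_lpi_hits), Thread(target=invalidate_cache)]
for t in threads:
    t.start()
for t in threads:
    t.join()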

With enough iterations and luck (or heap spraying), UAF may result.

- Who is affected? Any ARM64 system running Linux with KVM and using ITS emulation (GICv3).
- When did this get fixed? See the kernel commit, merged in early 2024.
- What should I do? Upgrade Linux to a version that includes this patch, especially if you allow untrusted guest VMs on ARM64 hosts.
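As a first triage step, a short script can flag hosts that match the "who is affected" description. This is only a rough sketch: the function names are hypothetical, and the presence of /dev/kvm is used as an assumed proxy for "KVM is available". It does not tell you whether ITS emulation is in use or whether the running kernel already carries the fix, so a match still needs to be checked against your distribution's advisory.

```python
import os
import platform


def is_in_scope(system, machine, kvm_device_present):
    """Return True for a Linux ARM64 host that exposes KVM.

    This only narrows down candidates; it says nothing about patch status.
    """
    return (
        system == "Linux"
        and machine in ("aarch64", "arm64")
        and kvm_device_present
    )


def current_host_in_scope():
    # platform.machine() reports "aarch64" on 64-bit ARM Linux.
    return is_in_scope(
        platform.system(),
        platform.machine(),
        os.path.exists("/dev/kvm"),
    )


if __name__ == "__main__":
    if current_host_in_scope():
        print("ARM64 Linux host with KVM: check the kernel against the advisory")
    else:
        print("Host does not match the affected configuration")
```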

More Reading

- Official Linux kernel patch
- NVD Database: CVE-2024-26598
- KVM ARM64 Documentation

Summary

CVE-2024-26598 is a subtle but serious bug in the ARM64 KVM code that could allow a guest VM to crash or compromise the host. It is typical of the bugs that crop up in highly concurrent, performance-tuned kernel code, where a lookup hands out a pointer without also handing out a reference. If you're managing ARM64/KVM systems, patch your kernels now.

Stay safe: protect your hosts, even from your guests!

Timeline

Published on: 02/23/2024 15:15:09 UTC
Last modified on: 04/17/2024 19:40:31 UTC