CVE-2024-39483 - Critical Vulnerability in Linux Kernel’s KVM NMI Handling

A recently patched flaw, CVE-2024-39483, affected the Linux kernel’s KVM (Kernel-based Virtual Machine) subsystem, specifically targeting how NMIs (Non-Maskable Interrupts) are handled when virtualizing AMD processors with SVM (Secure Virtual Machine) support. This long-read dives into the nature of the bug, why it mattered, the technical core of the problem, and how the patch addresses the risk. We’ll also review potential exploit scenarios and show example code along the way.

What Is CVE-2024-39483 All About?

NMI handling is a fundamental part of hardware interrupt management — NMIs, by their very nature, are designed *not* to be masked or suppressed. However, in virtual environments (like KVM), emulation and edge-cases introduce subtle and risky behaviors.

The flaw lies in the interaction between *virtual NMI support* (vNMI) and the way KVM requests "NMI windows" when it cannot inject an NMI immediately. The problematic case arises under unusual but possible timing: for example, if the vCPU is preempted in an “STI shadow” (the one-instruction window after STI in which interrupts are still held off), or if the Global Interrupt Flag (GIF) is cleared.
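
To make these conditions concrete, here is a minimal sketch in plain C. It is not kernel code: `struct vcpu_model`, `enum nmi_block_reason`, and `nmi_block_reason()` are hypothetical names introduced only for illustration. The sketch classifies why an NMI cannot be injected right now, separating "already handling an NMI" from the transient STI-shadow and GIF=0 cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the vCPU state relevant to NMI delivery. */
struct vcpu_model {
    bool gif;         /* Global Interrupt Flag: false means GIF=0 */
    bool sti_shadow;  /* true while executing the instruction after STI */
    bool nmi_masked;  /* true while the guest is handling an NMI (until IRET) */
};

enum nmi_block_reason {
    NMI_DELIVERABLE,     /* nothing prevents immediate injection */
    NMI_BLOCKED_MASKED,  /* NMIs outright masked: an NMI is in progress */
    NMI_BLOCKED_GIF,     /* GIF=0 holds off all interrupts, NMIs included */
    NMI_BLOCKED_SHADOW,  /* one-instruction interrupt shadow after STI */
};

/* Report the first reason, if any, that prevents immediate NMI injection. */
static enum nmi_block_reason nmi_block_reason(const struct vcpu_model *v)
{
    assert(v != NULL);

    if (v->nmi_masked)
        return NMI_BLOCKED_MASKED;
    if (!v->gif)
        return NMI_BLOCKED_GIF;
    if (v->sti_shadow)
        return NMI_BLOCKED_SHADOW;
    return NMI_DELIVERABLE;
}
```

Only `NMI_BLOCKED_MASKED` corresponds to the guest already handling an NMI; the other two blocked states are short-lived and clear on their own once the guest executes the next instruction or sets GIF again.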

If abused, this could potentially impact the *isolation* of virtual machines, creating a gap where malicious code on a guest could influence or predict processor state in subtle ways or cause hypervisor instability.

Original Patch Reference

- Patch (kernel.org)
- LKML discussion: WARN on vNMI + NMI window iff NMIs are outright masked

Why Does This Matter?

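On bare metal, two NMIs almost *never* collide inside an STI shadow, because the shadow lasts only a single instruction. Inside a hypervisor the window is much wider, for example when the vCPU is preempted while in the shadow, so two NMIs can arrive simultaneously from KVM’s point of view:

- The first is injected
- The second is either injected directly or must be pended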

An attacker could attempt to manipulate the vCPU state by flooding NMIs, increasing the odds of hitting the vulnerable state, possibly leading to incorrect hypervisor handling or information leakage due to mishandling pending NMI bits.
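
The "inject one, pend the other" rule can be sketched as a tiny state machine. This is an illustrative model under simplifying assumptions, not KVM's implementation: `struct nmi_state`, `nmi_arrive()`, and `nmi_iret()` are hypothetical names, and the model folds "to be injected" and "in service" into a single flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: at most one NMI in service and one latched behind it. */
struct nmi_state {
    bool in_service;   /* an NMI has been injected and is being handled */
    bool pending;      /* one further NMI is latched, waiting for IRET */
    unsigned dropped;  /* NMIs that arrived while both slots were full */
};

/* A new NMI arrives: inject it, pend it, or drop it if both slots are taken. */
static void nmi_arrive(struct nmi_state *s)
{
    if (!s->in_service)
        s->in_service = true;
    else if (!s->pending)
        s->pending = true;
    else
        s->dropped++;

    /* After any arrival, some NMI is always in service. */
    assert(s->in_service);
}

/* The guest's IRET ends NMI handling; a latched NMI is delivered next. */
static void nmi_iret(struct nmi_state *s)
{
    s->in_service = s->pending;
    s->pending = false;
}
```

Flooding this model with NMIs never queues more than one behind the NMI in service, which is why a flood mainly raises the odds of hitting an awkward timing window rather than building an unbounded backlog.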

KVM’s NMI Injection Logic (Before Patch)

/* Simplified illustration, not the literal upstream code */
if (enable_vnmi) {
    if (vcpu->arch.nmi_pending) {
        /* A second NMI is pending: set V_NMI_PENDING and let hardware deliver it */
        vmcb->control.int_ctl |= V_NMI_PENDING_MASK;
    }
} else {
    /* Traditional NMI window handling */
    kvm_make_request(KVM_REQ_NMI_WINDOW, vcpu);
}

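However, when KVM cannot inject an NMI immediately, for example because the vCPU is in an STI shadow or running with GIF=0, it _still_ has to request an NMI window, even with vNMI enabled. Before the patch, any such request tripped a warning that assumed an NMI window is never needed when vNMI is on. That assumption only holds when NMIs are masked because the guest is already handling one, not when delivery is merely held off by the STI shadow or GIF=0.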

The patch tightened the check:

- KVM warns only if vNMI is enabled *and* NMIs are outright masked (i.e., the guest is already handling an NMI)
- NMI window requests caused by an STI shadow or GIF=0 are treated as legitimate and no longer warn

Key difference: KVM no longer behaves differently depending on whether vNMI is on or off, and it is less trigger-happy about spurious warnings.

### Patch Code Snippet (from commit 7f16296b2413…):

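/* Simplified from svm_enable_nmi_window(); see the upstream commit for the exact code */
if (svm_get_nmi_mask(vcpu)) {
    /* NMIs are outright masked: with vNMI, KVM should never get here */
    WARN_ON_ONCE(is_vnmi_enabled(svm));

    if (!svm->awaiting_iret_completion)
        return; /* IRET will cause a VM exit */
}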

This restricts the warning to the case where NMIs are outright masked, that is, the vCPU is already handling an NMI. NMI window requests caused by an STI shadow or GIF=0 still happen, but they no longer trigger a spurious warning.
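
The change in the warning condition can be summarized as two predicates. This is a hedged sketch of the behavior as described in the patch title, not the kernel source: `struct window_request`, `warns_before_patch()`, `warns_after_patch()`, and `check_narrowing()` are hypothetical names used only here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs at the moment KVM requests an NMI window. */
struct window_request {
    bool vnmi_enabled;  /* virtual NMI support is in use */
    bool nmi_masked;    /* the guest is already handling an NMI */
};

/* Pre-patch behavior as described: any window request with vNMI warns. */
static bool warns_before_patch(struct window_request r)
{
    return r.vnmi_enabled;
}

/* Post-patch behavior: warn if and only if NMIs are outright masked. */
static bool warns_after_patch(struct window_request r)
{
    return r.vnmi_enabled && r.nmi_masked;
}

/* The patch only narrows the warning: it never warns where the old check stayed silent. */
static void check_narrowing(struct window_request r)
{
    assert(!warns_after_patch(r) || warns_before_patch(r));
}
```

A window request made with vNMI enabled but NMIs not masked (the STI-shadow and GIF=0 cases) is exactly the input on which the two predicates disagree.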

The vulnerability does not provide direct code execution. However, it:

- Allows a guest to hammer the STI-shadow or GIF=0 windows with NMIs, possibly desynchronizing hypervisor/guest state
- Might leak timing or behavioral differences between vNMI enabled/disabled, useful for attacks on virtualization boundaries
- Underlies possible future DoS or “confused deputy” scenarios, if state-pending logic is abused (for example, a guest-triggerable warning can bring down a host configured with panic_on_warn)

### Minimal Reproduction/Theoretical Exploit

An attacker would attempt to flood the guest with back-to-back NMIs while trying to keep GIF=0 or stay within the STI shadow:

// Pseudocode: hammer NMIs around the STI shadow.
// Note: cli/sti toggle RFLAGS.IF, not GIF; GIF is controlled by the
// privileged CLGI/STGI instructions.
while (1) {
    asm volatile ("cli");       // Clear RFLAGS.IF
    send_nmi();                 // Hypothetical helper: send synthetic NMI via hypervisor API
    asm volatile ("sti");       // Set RFLAGS.IF; the next instruction runs in the STI shadow
    pause();
}

This isn't a weaponized exploit, but repeated behavior can increase chances of hitting the narrow window where KVM must choose how to handle the NMI edge case.

In Summary

- CVE-2024-39483 involved subtle mishandling of NMI window requests under rare (but possible) conditions (STI shadow or GIF=0) in KVM’s vNMI support.
- It doesn’t break isolation by itself, but widens the gap for timing-based bugs or VM state desync.
- The patch ensures the warning only triggers for *truly* masked NMIs, unifying vNMI and non-vNMI handling.

Conclusion

This issue is a great example of how even obscure corner-cases in hypervisor interrupt logic can become meaningful security bugs, especially in high-density cloud environments. If you run multi-tenant clouds or allow untrusted code in VMs, update your kernel as soon as possible. This patch ensures clean, hardware-faithful handling for NMIs — closing the gap on NMI window abuse.

Timeline

Published on: 07/05/2024 07:15:10 UTC
Last modified on: 07/15/2024 06:50:19 UTC