Published: December 2024
Severity: High
Component: Linux Kernel - KVM (Kernel-based Virtual Machine)
Source: Upstream Linux Kernel Commit
Introduction
The Linux kernel’s KVM subsystem is widely used to provide hardware-accelerated virtualization. With the increasing adoption of processor-tracing technologies like Intel® PT (Processor Trace), support for virtualizing these features becomes essential.
However, feature support is only as robust as its implementation.
CVE-2024-53135 exposes significant flaws in KVM’s support for virtualizing Intel PT, leading the kernel developers to lock down the feature behind a BROKEN config option. In short: _Intel PT guest/host virtualization is now disabled unless you go out of your way to enable it—because serious problems were found_.
What is Intel PT and Why is Virtualization Hard?
Intel PT lets you trace exactly what instructions a CPU is executing, which is invaluable for debugging and performance analysis.
Both the host and guest might want to use Intel PT, which creates two requirements:
- KVM must prevent the guest from interfering with the host's tracing (and vice versa).
- Kernel and CPU states must be properly isolated before/after each VM run.
Failure to do this can mean everything from application crashes inside the VM to host bugs, deadlocks, and data corruption.
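To make the isolation requirement concrete, here is a minimal user-space sketch of swapping PT state around a VM run. It is an illustration under stated assumptions, not KVM's code: `struct pt_ctx`, `cpu_pt`, and `pt_switch` are hypothetical names, the "CPU" is a plain struct rather than real MSRs, and only three of the PT registers are modeled. The one hardware fact it relies on is that TraceEn is bit 0 of IA32_RTIT_CTL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTIT_CTL_TRACEEN (1ULL << 0) /* IA32_RTIT_CTL.TraceEn is bit 0 */

/* Hypothetical, simplified PT context; real hardware has more registers. */
struct pt_ctx {
    uint64_t ctl;         /* models IA32_RTIT_CTL */
    uint64_t output_base; /* models IA32_RTIT_OUTPUT_BASE */
    uint64_t status;      /* models IA32_RTIT_STATUS */
};

/* Stand-in for the physical CPU's PT registers (no real MSR access). */
static struct pt_ctx cpu_pt;

/*
 * Save the outgoing context and load the incoming one.
 * Tracing is stopped while the other registers change, and the control
 * register is written last so tracing only resumes on a consistent state.
 */
static void pt_switch(struct pt_ctx *save_to, const struct pt_ctx *load_from)
{
    assert(save_to != NULL && load_from != NULL);

    *save_to = cpu_pt;                 /* remember outgoing state, including TraceEn */
    cpu_pt.ctl &= ~RTIT_CTL_TRACEEN;   /* stop tracing before touching anything else */
    cpu_pt.output_base = load_from->output_base;
    cpu_pt.status = load_from->status;
    cpu_pt.ctl = load_from->ctl;       /* written last; may re-enable tracing */
}
```

In this model, the host would call `pt_switch(&host_state, &guest_state)` before entering the guest and `pt_switch(&guest_state, &host_state)` after exit. Getting the ordering wrong in either direction is exactly the kind of mistake the rest of this post describes.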
Problem 1: Host tracing left enabled at VM entry
The hardware requires host PT tracing to be off at VM entry whenever the VMCS is configured to load the guest's IA32_RTIT_CTL. If the Linux host forgets this, or gets the ordering wrong, it violates a rule from Intel's SDM:
> If the logical processor is operating with Intel PT enabled (if IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load IA32_RTIT_CTL" VM-entry control must be 0.
KVM doesn’t enforce this. If tracing is accidentally left on, VM entry fails, which in practice kills the guest VM.
Sample simplified buggy flow
// Not checked: host PT status
kvm_load_guest_pt_state(); // Potentially enables PT tracing before it's safe
vmx_enter_guest(); // Enters guest. Might cause a VM-Entry failure!
Result: VM entry fails and the guest is killed.
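The SDM rule is simple enough to write as a predicate. The sketch below is a user-space model, not kernel code: `pt_vm_entry_allowed` and `pt_stop_host_tracing` are hypothetical helpers, and the caller is assumed to pass in the current IA32_RTIT_CTL value and the state of the "load IA32_RTIT_CTL" VM-entry control.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RTIT_CTL_TRACEEN (1ULL << 0) /* IA32_RTIT_CTL.TraceEn is bit 0 */

/*
 * Returns true if the SDM rule is satisfied.
 * host_rtit_ctl: value of IA32_RTIT_CTL at the moment of VM entry.
 * load_rtit_ctl: whether the "load IA32_RTIT_CTL" VM-entry control is 1.
 */
static bool pt_vm_entry_allowed(uint64_t host_rtit_ctl, bool load_rtit_ctl)
{
    bool tracing = (host_rtit_ctl & RTIT_CTL_TRACEEN) != 0;

    /* Tracing enabled AND the control set is the forbidden combination. */
    return !(tracing && load_rtit_ctl);
}

/*
 * Hypothetical fix-up: clear TraceEn before entry.
 * The caller must have saved the original value if it wants to restore it.
 */
static void pt_stop_host_tracing(uint64_t *host_rtit_ctl)
{
    assert(host_rtit_ctl != NULL);
    *host_rtit_ctl &= ~RTIT_CTL_TRACEEN;
}
```

A careful entry path would evaluate the predicate (or unconditionally stop host tracing) before attempting VM entry, rather than letting the hardware reject the entry.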
Problem 2: Guest CPUID is trusted without validation
KVM lets userspace (like QEMU) specify which features the guest VM thinks it has, via CPUID.
But the KVM code does not validate that the host actually supports these settings. Worse—KVM uses these guest values to decide which MSRs (Model-Specific Registers) to touch when switching VM contexts.
If the guest CPUID advertises more tracing address ranges than the hardware has, KVM will attempt to save and restore non-existent MSRs, which can result in host warnings, ToPA errors, and potentially a deadlock.
Sample simplified buggy logic
// Take guest's CPUID ranges count at face value:
for (i = 0; i < guest_cpuid_num_ranges; i++) {
save_guest_msr(RTIT_ADDR_AND_MASK_MSR_BASE + i); // Dangerous!
}
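For contrast, here is a sketch of the validation that is missing. It is not what upstream did (the fix was to hide the feature) and the function names are hypothetical. It assumes the address-range count comes from CPUID.(EAX=14H,ECX=1):EAX[2:0] and that the range MSRs start at IA32_RTIT_ADDR0_A (0x580), with an `_A`/`_B` pair per range.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_RTIT_ADDR0_A 0x580u /* first address-range MSR */

/* Extract the address-range count from CPUID.(EAX=14H,ECX=1):EAX[2:0]. */
static unsigned int pt_num_addr_ranges(uint32_t cpuid_14_1_eax)
{
    return cpuid_14_1_eax & 0x7u;
}

/* Reject a guest CPUID that advertises more ranges than the host has. */
static int pt_validate_guest_ranges(uint32_t guest_eax, uint32_t host_eax)
{
    if (pt_num_addr_ranges(guest_eax) > pt_num_addr_ranges(host_eax))
        return -EINVAL;
    return 0;
}

/*
 * Collect the MSR indices that are safe to save/restore.
 * The guest's count is clamped to the host's, so a bogus guest CPUID
 * can never produce an index the hardware lacks.
 * Returns the number of indices written to out.
 */
static unsigned int pt_addr_msrs(uint32_t guest_eax, uint32_t host_eax,
                                 uint32_t *out, unsigned int out_len)
{
    unsigned int n = pt_num_addr_ranges(guest_eax);
    unsigned int host_n = pt_num_addr_ranges(host_eax);
    unsigned int i, count = 0;

    assert(out != NULL);
    if (n > host_n)
        n = host_n; /* never exceed what the hardware provides */

    for (i = 0; i < n && count + 2 <= out_len; i++) {
        out[count++] = MSR_IA32_RTIT_ADDR0_A + 2 * i;     /* ADDRn_A */
        out[count++] = MSR_IA32_RTIT_ADDR0_A + 2 * i + 1; /* ADDRn_B */
    }
    return count;
}
```

Rejecting the bad CPUID up front (`pt_validate_guest_ranges`) and clamping at the point of use (`pt_addr_msrs`) are two layers of the same defense; either one alone would stop the out-of-range MSR accesses in this model.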
The Fix: Hide Intel PT Virtualization Behind CONFIG_BROKEN
Kernel developers decided these issues (there are _many more_) are too risky for production.
So the fix hides the pt_mode module parameter behind CONFIG_BROKEN:
// arch/x86/kvm/vmx/vmx.c (simplified)
#ifdef CONFIG_BROKEN
module_param_named(pt_mode, pt_mode, int, 0444);
#endif
Meaning: *unless you build your kernel with CONFIG_BROKEN=y (which almost nobody does outside development and testing), you can no longer virtualize Intel PT.*
Code gated behind CONFIG_BROKEN is treated as highly experimental or unusable.
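One way to see the effect on a running system is to check whether the parameter is exposed at all. A read-only module parameter normally shows up under `/sys/module/<module>/parameters/`, so on a kernel with the fix and without CONFIG_BROKEN the `pt_mode` file should simply be absent. The helper below is a small sketch with a hypothetical name (`read_int_param`); it only reads a file and makes no assumptions beyond that sysfs layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Read an integer module parameter from a sysfs-style file.
 * Returns 0 and stores the value on success.
 * Returns -1 if the file is missing or does not contain an integer;
 * in that case *value is left untouched.
 */
static int read_int_param(const char *path, int *value)
{
    FILE *f;
    int parsed;
    int ok;

    assert(path != NULL && value != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1; /* parameter not exposed, or module not loaded */

    ok = fscanf(f, "%d", &parsed);
    fclose(f);
    if (ok != 1)
        return -1;

    *value = parsed;
    return 0;
}
```

Calling it with `/sys/module/kvm_intel/parameters/pt_mode` distinguishes the cases: a return of -1 means the parameter is hidden (or kvm_intel is not loaded), a value of 0 is the default system mode, and 1 is the host-guest mode this CVE is about.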
Original patch:
Hide KVM’s pt_mode module param behind CONFIG_BROKEN
Exploiting CVE-2024-53135: Proof-of-Concept
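Because KVM trusts the guest CPUID supplied by host userspace, a malicious or misconfigured VMM (for example, a crafted QEMU setup), combined with a guest that programs the advertised PT MSRs, can make KVM access MSRs that do not exist on the host. The likely outcome is host instability or denial of service.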
Example: Malicious QEMU setup
# Illustrative only: QEMU command exposing Intel PT to the guest.
# Advertising extra address ranges would need a VMM that sets a fake CPUID leaf 0x14.
qemu-system-x86_64 \
-cpu host,+intel-pt \
-smp 2 \
[...]
Inside your own VM
// In the guest kernel: write an address-range MSR that the inflated CPUID advertises but the hardware lacks
wrmsr(RTIT_ADDRn_A, 0xdeadbeef, ...); // KVM later tries to save/restore this MSR on the host
Host’s dmesg may print errors like
KVM: WARNING: trying to access non-existent MSR 0x00000560
KVM: ToPA error: unexpected host MSR state
[system hang or deadlock]
Impact
- VM guests can kill or destabilize the host by tricking KVM into mishandling CPU tracing resources.
Recommendations
- Don't enable Intel PT virtualization unless you’re testing fixes upstream.
- For downstream distros/vendors, do not patch this restriction out.
- If you are a kernel or hardware researcher, only test in throwaway environments, since a failure can hang the host.
Further Reading and References
- Linux commit: Hide KVM's pt_mode param behind BROKEN
- KVM documentation
- Intel® 64 and IA-32 Architectures Software Developer’s Manual (SDM), Vol 3, Section 35.5.11
- QEMU documentation on Intel PT
Conclusion
CVE-2024-53135 is a sobering reminder of how risky it can be to virtualize advanced CPU features without bulletproof validation. The Linux community reacted quickly by disabling the risky code until it can be made safe. If you run VMs or clouds, keep your kernel updated, and don’t re-enable this experimental feature!
*Authored for educational and awareness purposes. Stay safe and keep your systems patched!*
Timeline
Published on: 12/04/2024 15:15:13 UTC
Last modified on: 12/19/2024 09:40:03 UTC