Published: 2024-06
Author: Linux Security Expert

What is CVE-2024-50082?

CVE-2024-50082 is a critical race condition vulnerability in the Linux kernel’s block layer “request queue quality of service” (blk-rq-qos) code. The bug affects kernel version 6.12 (and possibly earlier) and can cause a kernel crash when a narrow, rarely hit timing window between two threads is triggered.

If you're running Linux on a server or VM, especially with heavy disk IO, this is stuff you should know.

The Crash in Human Terms

When block IO is throttled or managed, the kernel sometimes puts IO tasks to sleep, waiting for “tokens” to proceed. Two key functions — rq_qos_wait() (to sleep until a token is granted) and rq_qos_wake_function() (to give another waiter a token and wake it up) — work together.
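
To make the "token" idea concrete, here is a minimal userspace sketch using C11 atomics. It is loosely modeled on a limit-checked counter increment; the names token_pool, token_try_get() and token_put() are hypothetical and are not kernel APIs. A caller that fails to get a token is the one that would go to sleep in the real code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical pool: counts how many tokens are currently handed out. */
struct token_pool {
    atomic_uint inflight;
};

/*
 * Take a token only if fewer than 'limit' are in flight.
 * The compare-and-swap loop retries if another thread changed the
 * counter between our read and our update.
 */
static bool token_try_get(struct token_pool *pool, unsigned int limit)
{
    unsigned int cur = atomic_load(&pool->inflight);

    do {
        if (cur >= limit)
            return false;   /* no token: the caller would have to wait */
    } while (!atomic_compare_exchange_weak(&pool->inflight, &cur, cur + 1));

    return true;
}

/* Return a token so that a waiter can be granted one. */
static void token_put(struct token_pool *pool)
{
    atomic_fetch_sub(&pool->inflight, 1);
}
```

With a limit of 2, the first two token_try_get() calls succeed and the third fails until someone calls token_put().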

The Race:
Imagine two threads:

Waiter: Sleeps in rq_qos_wait() until it is granted a token.

Waker: Grants a token and tries to wake a waiter (rq_qos_wake_function()).

Because of an ordering bug in how the waker handles the waiter's wait entry, the waiter can return and reuse its stack data while the waker is still accessing it. If the waker then tries to wake a task through memory that has already been repurposed, the kernel crashes.

Crash output

BUG: unable to handle page fault for address: ffffafe180a40084
#PF: supervisor write access in kernel mode
...
RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
...
Call Trace:
 <IRQ>
 try_to_wake_up+0x5a/0x6a0
 rq_qos_wake_function+0x71/0x80
 __wake_up_common+0x75/0xa0
 __wake_up+0x36/0x60
 scale_up.part.0+0x50/0x110
 wb_timer_fn+0x227/0x450
 ...

The cause?
- The waking function reads the waiter's data *after* removing its entry from the wait queue. That opens a tiny window in which the waiter can return and reuse its stack, so the waker ends up touching stale memory: effectively a “use-after-free” of stack data.

Let's see step-by-step what went wrong (pseudocode style)

// Thread 1: Waiter (rq_qos_wait); 'data' lives on the waiter's stack
prepare_to_wait_exclusive(&wait_queue, &data.wq, ...);
// ... gets scheduled away, or is about to check got_token

// Thread 2: Waker (rq_qos_wake_function); 'data' points into the waiter's stack
data->got_token = true;
list_del_init(&curr->entry); // BAD: unlinks the entry before the wakeup

// Meanwhile, Waiter resumes:
if (data.got_token)
    break;
finish_wait(&wait_queue, &data.wq); // returns immediately: entry is already unlinked
// ... waiter returns; its stack frame (including 'data') is reused elsewhere ...

// Waker continues (still holding a pointer into that stack memory!):
wake_up_process(data->task); // Oops! 'data->task' may be stack garbage now

The Proper Fix

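According to the official patch, the fix is to swap the order of operations in rq_qos_wake_function():

First: Wake the waiter (wake_up_process()).

Then: Delete the waitqueue entry (list_del_init_careful()).

While the entry is still on the queue, finish_wait() in the waiter has to take the waitqueue lock, which the waker holds. This way, the waiter cannot return and reuse its stack data until the waker has finished using it.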
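
The "careful" list helpers matter here because the waiter checks its entry without holding the lock. The following is a userspace sketch of that release/acquire pairing using C11 atomics. The node type and node_* functions are hypothetical analogues written for illustration, not the kernel's implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical circular doubly linked list node. */
struct node {
    _Atomic(struct node *) next;
    struct node *prev;
};

/* An initialized node points at itself, which means "not on any list". */
static void node_init(struct node *n)
{
    atomic_init(&n->next, n);
    n->prev = n;
}

/* Append 'n' before 'head'. Assumes the caller holds the list lock. */
static void node_add_tail(struct node *n, struct node *head)
{
    struct node *last = head->prev;

    n->prev = last;
    atomic_store_explicit(&n->next, head, memory_order_relaxed);
    atomic_store_explicit(&last->next, n, memory_order_relaxed);
    head->prev = n;
}

/*
 * Unlink 'n' and make it point at itself again. The final store is a
 * release, so everything the unlinking thread did before it is visible
 * to a thread that observes the node as empty.
 */
static void node_del_init_careful(struct node *n)
{
    struct node *next = atomic_load_explicit(&n->next, memory_order_relaxed);
    struct node *prev = n->prev;

    next->prev = prev;
    atomic_store_explicit(&prev->next, next, memory_order_relaxed);
    n->prev = n;
    atomic_store_explicit(&n->next, n, memory_order_release);
}

/* Lockless check that pairs with the release store above. */
static bool node_empty_careful(struct node *n)
{
    struct node *next = atomic_load_explicit(&n->next, memory_order_acquire);

    return next == n && n->prev == n;
}
```

In this sketch, a waiter that sees node_empty_careful() return true can rely on the unlinking thread being finished with everything it did before the unlink, which is why the wakeup has to come first.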

Reference: Official Patch

- Linux kernel mailing list discussion

Fixed code excerpt

// OLD, buggy order:
data->got_token = true;
list_del_init(&curr->entry);    // <-- entry is unhooked too soon
wake_up_process(data->task);    // <-- data may be gone!

// NEW, safe order:
data->got_token = true;
wake_up_process(data->task);    // <-- safely wake while data valid
list_del_init_careful(&curr->entry); // <-- now unlink the entry
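
The difference between the two orders can be replayed deterministically in a toy, single-threaded model. This is only a sketch under simplifying assumptions: the waiter is "scheduled" after every waker step, it can return only once it has its token and finds its entry unlinked, and returning overwrites its stack data. All names and values (wait_data, waiter_step(), run_waker(), VALID_TASK, STALE_TASK) are hypothetical.

```c
#include <stdbool.h>

enum { VALID_TASK = 42, STALE_TASK = -1 };   /* hypothetical task ids */

enum waker_order { UNLINK_THEN_WAKE, WAKE_THEN_UNLINK };

/* Toy stand-in for the waiter's on-stack wait data. */
struct wait_data {
    bool got_token;
    bool on_list;   /* true while the wait entry is still queued */
    int task;       /* stands in for data->task */
};

/*
 * Let the waiter run for one step. It returns only if it has seen
 * got_token and its entry is already unlinked. If the entry is still
 * queued, it would block on the waitqueue lock held by the waker.
 * Returning "reuses" the stack frame, modeled by clobbering 'task'.
 */
static void waiter_step(struct wait_data *d)
{
    if (d->got_token && !d->on_list)
        d->task = STALE_TASK;
}

/* Run the waker in the given order; return the task value it wakes. */
static int run_waker(enum waker_order order)
{
    struct wait_data d = { false, true, VALID_TASK };
    int woken;

    d.got_token = true;
    waiter_step(&d);

    if (order == UNLINK_THEN_WAKE) {
        d.on_list = false;   /* old order: unlink first */
        waiter_step(&d);     /* waiter returns, frame is reused */
        woken = d.task;      /* wakeup reads clobbered data */
    } else {
        woken = d.task;      /* new order: wake while data is valid */
        waiter_step(&d);     /* waiter is still held back */
        d.on_list = false;   /* unlink last */
        waiter_step(&d);     /* waiter may return; waker is done */
    }
    return woken;
}
```

In this model, run_waker(UNLINK_THEN_WAKE) yields STALE_TASK, while run_waker(WAKE_THEN_UNLINK) yields VALID_TASK.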

Exploitability and Impact

This bug is not known to be reliably exploitable for gaining root or running code; the practical outcome is a crash. Still, a malicious user or process could *intentionally* generate heavy disk IO and try to hit the timing window to bring down the system, and an unlucky workload could hit it by accident.

- Denial of Service: The most practical impact is a system crash (“oops”), leading to loss of data and uptime.
- No Escalation: No known ability to escalate privileges.

In cloud, virtualized, or multi-user shared hosting, this could be used to *repeatedly crash* a system, impacting reliability.

Update kernels:

Apply your distributor's latest kernel patches. The upstream fix was merged in October 2024, ahead of the CVE's publication.
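
As a rough aid when auditing machines, the running kernel release can be read with uname() and compared against a version number. The sketch below is only a hint, not a verdict: distributors often backport fixes without changing the upstream version number, and the threshold has to come from your distributor's advisory. The kver type and helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

struct kver {
    int major, minor, patch;
};

/* Parse "6.12.0-rc3" style strings; a missing patch level counts as 0. */
static bool kver_parse(const char *release, struct kver *out)
{
    out->major = out->minor = out->patch = 0;
    return sscanf(release, "%d.%d.%d",
                  &out->major, &out->minor, &out->patch) >= 2;
}

/* True if 'have' is the same as or newer than 'want'. */
static bool kver_at_least(struct kver have, struct kver want)
{
    if (have.major != want.major)
        return have.major > want.major;
    if (have.minor != want.minor)
        return have.minor > want.minor;
    return have.patch >= want.patch;
}

/* Compare the running kernel against a caller-supplied threshold. */
static bool running_kernel_at_least(struct kver want)
{
    struct utsname u;
    struct kver have;

    if (uname(&u) != 0 || !kver_parse(u.release, &have))
        return false;
    return kver_at_least(have, want);
}
```

A caller would pass the fixed version from the relevant advisory to running_kernel_at_least() and treat a false result as a prompt to check the package changelog, not as proof of exposure.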

References

- CVE-2024-50082 on Mitre
- Official patch & explanation on LKML
- Linux Block Subsystem GIT commit

In Summary

CVE-2024-50082 is a rare, subtle but dangerous race in Linux's block IO token wait logic. It can result in a hard crash if unlucky timing hits — and that's not just theory: real-world crash traces show it occurs. Patches are available now. If you care about stability, review, test and update as soon as possible!


If you found this breakdown helpful, share with fellow sysadmins and kernel hackers — and always keep an eye out for race conditions in concurrent code!

Timeline

Published on: 10/29/2024 01:15:05 UTC
Last modified on: 10/30/2024 15:44:05 UTC