---
## Quick Summary
CVE-2024-26627 is a vulnerability in the Linux kernel's SCSI subsystem, where inefficient handling of the busy-count check inside a locked section could cause heavy lock contention and even hard lockups. This flaw especially hits systems with a large number of hardware queues and deep queue depths. A kernel update has since fixed this.
Below, I’ll break down what happened, how it was fixed, and how you might exploit or detect this bug. Simplified code sketches are provided for a practical understanding, along with pointers to key resources.
## What is the SCSI Host Lock Problem?
The Linux SCSI core uses an error handler (EH) to recover from faults. When something goes wrong with a request, the kernel may wake the host's EH thread to sort things out. To decide whether to wake it, the code checked how many requests were busy while still holding the host lock. With high queue counts and deep queues, that check (scsi_host_busy()) has to walk every request tag and can take a long time, which makes every wakeup attempt sluggish.
Because each check runs while holding the lock, every other CPU waiting for that lock has to spin until the scan finishes. Wait time and CPU load multiply across threads, causing significant slowdowns or even a kernel hard lockup, where the system appears to freeze.
**Here’s a basic sketch** (not the exact code from the kernel, but simplified for clarity):

    // Simplified sketch of scsi_eh_wakeup() before the fix.
    // In the real kernel the callers take host_lock; it is shown inline here.
    void scsi_eh_wakeup(struct Scsi_Host *shost) {
        spin_lock(shost->host_lock);      // <-- Lock acquired (host_lock is a pointer)
        // Wake the error handler once every busy command has failed
        if (scsi_host_busy(shost) == shost->host_failed) {
            wake_up_process(shost->ehandler);
        }
        spin_unlock(shost->host_lock);    // <-- Lock released
    }
    // scsi_host_busy() iterates over all requests (N * M), potentially very slow!

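Consequence: Once recovery is pending, every request that completes or fails ends up in scsi_eh_wakeup(), and each call iterates over all pending requests while holding the host lock. With N × M requests in flight, that adds up to roughly (N × M)² request visits, all serialized on one lock. When many requests are active (think: big storage servers), this multiplies the chance of delays and lockups.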
## Real world
On some configurations (e.g., an mpi3mr HBA with 128 hardware queues and a queue depth of 8169), the lock contention became bad enough to trigger a hard lockup, stalling the kernel.
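
To get a feel for the scale on that configuration, here is a rough back-of-the-envelope calculation. It assumes a simplified model in which every in-flight request triggers one full scan of all tags; the helper names `total_tags` and `worst_case_tag_visits` are made up for this illustration and are not kernel functions.

```c
#include <stdint.h>

/* Total request tags on a host: hardware queues times per-queue depth. */
uint64_t total_tags(uint64_t nr_hw_queues, uint64_t queue_depth)
{
    return nr_hw_queues * queue_depth;
}

/*
 * Rough upper bound on tag visits during recovery, ASSUMING one full
 * scan of every tag per in-flight request (a simplified model).
 */
uint64_t worst_case_tag_visits(uint64_t nr_hw_queues, uint64_t queue_depth)
{
    uint64_t tags = total_tags(nr_hw_queues, queue_depth);

    return tags * tags;
}
```

For 128 queues at depth 8169, `total_tags()` gives 1,045,632 tags, and the model puts the worst case at about 1.09 × 10^12 tag visits, all of them performed while some CPU holds the host lock.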
## The Fix (CVE-2024-26627 Patch)
Solution: Move scsi_host_busy() outside of the host lock, because reading busy count doesn't actually need locking.
**Patched sketch** (again simplified, not the exact kernel code):

    // After the fix
    void scsi_eh_wakeup(struct Scsi_Host *shost) {
        // Count busy requests without holding the lock
        unsigned int busy = scsi_host_busy(shost);

        // Hold the lock only for the cheap comparison and the wakeup
        spin_lock(shost->host_lock);
        if (busy == shost->host_failed) {
            wake_up_process(shost->ehandler);
        }
        spin_unlock(shost->host_lock);
    }

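
The core idea of the fix, doing the expensive count before taking the lock, can be tried out in user space. The following is only a toy analogy using a pthread mutex, not kernel code: `struct toy_host`, `NR_TAGS`, and all `toy_*` functions are hypothetical names invented for this sketch.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 1024 /* hypothetical: N queues * M depth */

struct toy_host {
    pthread_mutex_t lock;            /* stands in for host_lock */
    atomic_bool in_flight[NR_TAGS];  /* stands in for the tag set */
    unsigned int failed;             /* protected by lock */
    unsigned int wakeups;            /* counts simulated EH wakeups */
};

void toy_host_init(struct toy_host *h)
{
    pthread_mutex_init(&h->lock, NULL);
    for (size_t i = 0; i < NR_TAGS; i++)
        atomic_init(&h->in_flight[i], false);
    h->failed = 0;
    h->wakeups = 0;
}

/* O(NR_TAGS) scan; needs no lock because each slot is an atomic. */
unsigned int toy_host_busy(struct toy_host *h)
{
    unsigned int busy = 0;

    for (size_t i = 0; i < NR_TAGS; i++)
        if (atomic_load(&h->in_flight[i]))
            busy++;
    return busy;
}

/* Pre-fix pattern: the whole scan runs while the lock is held. */
void toy_wakeup_locked_scan(struct toy_host *h)
{
    pthread_mutex_lock(&h->lock);
    if (toy_host_busy(h) == h->failed)
        h->wakeups++;
    pthread_mutex_unlock(&h->lock);
}

/* Post-fix pattern: scan first, lock only for the cheap comparison. */
void toy_wakeup_unlocked_scan(struct toy_host *h)
{
    unsigned int busy = toy_host_busy(h);

    pthread_mutex_lock(&h->lock);
    if (busy == h->failed)
        h->wakeups++;
    pthread_mutex_unlock(&h->lock);
}
```

Both variants reach the same decision; the difference is how long the lock is held. In the second one the critical section is a single comparison, so other threads contending for the lock no longer wait for a full scan.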
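Impact: The expensive N × M scan no longer runs under the host lock, so CPUs contending for that lock no longer have to wait for it to finish.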
## Exploiting CVE-2024-26627 (in Lab or Testing)
This is more a DoS or reliability bug than a code execution threat; the "exploit" is a way to trigger a kernel hang.
1. Configure lots of hardware queues (N=128+) with big queue depths (M=800+).
2. Fire off enough I/O that all request tags are in use.
3. Now, induce any error (e.g., hot-unplug a disk, mess with cables).
4. The kernel enters SCSI recovery, then freezes or stalls as it burns CPU in the locked busy-check loop.
The result: the server locks up hard and may need a reboot.
Note: This is NOT a privilege escalation path, but is a significant service availability problem.
## How to Defend / Detect
- Upgrade your kernel! Make sure you have this patch or newer.
- Enable the kernel's lockup detector (watchdog) to catch hard lockups early.
- On affected platforms, monitor dmesg for multiple ‘lockup detected’ messages when stress testing SCSI error recovery.
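
As a starting point for that kind of monitoring, here is a small sketch that counts lockup reports in kernel log text (for example, the output of `dmesg` piped to standard input). The matched phrases "hard LOCKUP" and "soft lockup" are assumptions about how the watchdog words its messages; the exact text can differ between kernel versions, so adjust them to what your kernel prints. The function names are made up for this example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns true if a log line looks like a watchdog lockup report.
 * The phrases below are ASSUMED wordings, not a guaranteed format.
 */
bool line_reports_lockup(const char *line)
{
    return strstr(line, "hard LOCKUP") != NULL ||
           strstr(line, "soft lockup") != NULL;
}

/*
 * Counts matching lines in a stream. Lines longer than the buffer are
 * read in several chunks, which is acceptable for a rough count.
 */
unsigned int count_lockup_lines(FILE *in)
{
    char line[1024];
    unsigned int hits = 0;

    while (fgets(line, sizeof(line), in) != NULL) {
        if (line_reports_lockup(line))
            hits++;
    }
    return hits;
}
```

Calling `count_lockup_lines(stdin)` from a small `main()` and feeding it the kernel log during a stress test gives a quick signal: any non-zero count is worth investigating.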
## References / Further Reading
- Original patch: "scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler"
- Linux kernel CVE page
- Linux kernel SCSI core docs
## Final Thoughts
CVE-2024-26627 is a reminder that serious kernel bugs can come purely from locking and scaling decisions. Not all bugs are security holes with privilege escalation risks; some can ‘just’ break your system by pushing locks to their limits. If you run big storage servers, keep your kernels updated and watch for stalls during error recovery!
*Exclusive by Linux Explainer. Share and stay up to date!*
## Timeline
Published on: 03/06/2024 07:15:12 UTC
Last modified on: 10/31/2024 15:35:30 UTC