A new and subtle bug, now tracked as CVE-2024-57889, was discovered and fixed in the Linux kernel’s pinctrl subsystem, specifically in the driver for MCP23s08 family I/O expanders. This vulnerability revolved around unsafe locking behavior – more precisely, trying to take a mutex while already holding a spinlock during interrupt (IRQ) setup, which resulted in a “sleeping in atomic context” kernel oops.
This step-by-step guide explains what went wrong, how it affects Linux systems using MCP23xxx GPIO expanders (such as the MCP23017), and how the patch resolves the problem. By the end, you’ll know what the bug is, how it can be triggered, and what the fixed code looks like.
Background: What are MCP23xxx Chips and Their Linux Driver?
The MCP23s08 and similar MCP23xxx chips from Microchip are popular general-purpose I/O (GPIO) expanders controlled over I2C or SPI. In Linux, the driver (pinctrl-mcp23s08) allows these chips to be controlled as extra GPIO pins, complete with interrupt (IRQ) handling – that is, letting your Linux board receive pin-change events even when all hardware IRQ lines are in use.
They’re often used in embedded and industrial devices, including touchscreen systems.
The Bug: “Sleeping in Atomic Context” During IRQ Setup
The core problem occurs when the kernel configures the IRQ trigger type for a pin through the MCP23xxx driver. Internally, the code path is as follows:
- While setting up the IRQ, the kernel calls the driver's mcp23s08_irq_set_type() callback.
- This function calls regmap_update_bits_base() to change the hardware register.
- That function tries to acquire a mutex for safe register I/O.
The problem is that the kernel is already holding a spinlock, with interrupts disabled, when it enters this code. Code that holds a spinlock must not sleep, but taking a mutex may sleep if the mutex is unavailable. Sleeping while holding a spinlock breaks the kernel's locking rules and can crash the system.
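The rule described above can be illustrated with a small userspace model. This is only a sketch: every `model_*` name is hypothetical, the counter merely imitates the idea of an atomic-context check, and nothing here is the kernel's actual implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model; these are not the kernel's real definitions. */
static int model_preempt_count;  /* > 0 stands for "atomic context" */
static int model_violations;     /* sleeping calls made while atomic */

static void model_spin_lock(void)
{
    model_preempt_count++;
}

static void model_spin_unlock(void)
{
    assert(model_preempt_count > 0);
    model_preempt_count--;
}

/* Stand-in for a might_sleep()-style check: it only records the problem. */
static void model_might_sleep(const char *caller)
{
    if (model_preempt_count > 0) {
        model_violations++;
        fprintf(stderr, "model: %s may sleep in atomic context\n", caller);
    }
}

/* A mutex may sleep when contended, so locking it implies the check above. */
static void model_mutex_lock(void)
{
    model_might_sleep("model_mutex_lock");
}

static void model_mutex_unlock(void)
{
}

/* Shape of the register update: bracketed by the map's internal mutex. */
static void model_update_bits(void)
{
    model_mutex_lock();
    /* the register write would happen here */
    model_mutex_unlock();
}

/* Shape of IRQ setup: the trigger type is configured under a spinlock. */
int model_setup_irq(void)
{
    model_violations = 0;
    model_spin_lock();
    model_update_bits();
    model_spin_unlock();
    return model_violations;
}
```

Calling `model_setup_irq()` from a `main()` reports one violation, because the mutex is requested while the modelled spinlock is still held. The real kernel performs a comparable check and prints a BUG report instead of counting.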
Here’s how the bug shows up in logs:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, ...
preempt_count: 1, expected: 0
...
Call Trace:
...
__might_resched+0x104/0x10e
__might_sleep+0x3e/0x62
mutex_lock+0x20/0x4c
regmap_lock_mutex+0x10/0x18
regmap_update_bits_base+0x2c/0x66
mcp23s08_irq_set_type+0x1ae/0x1d6
__irq_set_trigger+0x56/0x172
__setup_irq+0x1e6/0x646
request_threaded_irq+0xb6/0x160
...
If your device tree or drivers use IRQs from an MCP23xxx chip, this bug may crash your board during boot or whenever such an IRQ is requested.
The root cause is that regmap's internal mutex is not needed in this driver, yet it is taken on a path that runs under a spinlock. In effect, the code tried to take a mutex while holding a spinlock, which Linux strictly forbids.
Can this bug be exploited for code execution?
- Not directly. It is a denial of service (DoS) issue: any local user or process able to set up (configure) a GPIO/IRQ line via the MCP23xxx driver could trigger a kernel oops or panic, thus crashing the system.
Typical exploit scenario
- Any kernel code (such as a touchscreen driver) that requests IRQ lines handled by MCP23xxx through the request_threaded_irq() sequence can hit the bug.
- An attacker with kernel module insertion ability (or exploiting a buggy driver) could trigger system instability or downtime by using the MCP23xxx interrupt in certain ways.
What did the patch do?
- It disables regmap's internal mutex, relying only on the driver's own mcp->lock for concurrency protection.
- It adds extra locking only where needed (some pin configuration code).
- All register accesses are now protected by a single, unified locking mechanism, so no mutex is taken while a spinlock is held.
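To make the first point concrete, here is a simplified userspace model of a register map that picks its lock callbacks from a configuration flag. The `model_*` types and functions are hypothetical and only imitate the idea behind `disable_locking`; they are not regmap's real structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a regmap-like object; not the kernel's types. */
struct model_config {
    bool disable_locking;     /* same idea as the disable_locking config field */
};

struct model_map {
    void (*lock)(struct model_map *map);
    void (*unlock)(struct model_map *map);
    int sleeping_lock_calls;  /* times a lock that may sleep was taken */
    unsigned int reg;         /* a single pretend register */
};

static void model_lock_mutex(struct model_map *map)
{
    map->sleeping_lock_calls++;
}

static void model_unlock_mutex(struct model_map *map)
{
    (void)map;
}

static void model_lock_none(struct model_map *map)
{
    (void)map;
}

/* Choose the lock callbacks once, based on the configuration. */
void model_map_init(struct model_map *map, const struct model_config *cfg)
{
    assert(map != NULL && cfg != NULL);
    map->sleeping_lock_calls = 0;
    map->reg = 0;
    if (cfg->disable_locking) {
        map->lock = model_lock_none;
        map->unlock = model_lock_none;
    } else {
        map->lock = model_lock_mutex;
        map->unlock = model_unlock_mutex;
    }
}

/* Read-modify-write of the masked bits, bracketed by whichever lock was chosen. */
void model_update_bits(struct model_map *map, unsigned int mask, unsigned int val)
{
    map->lock(map);
    map->reg = (map->reg & ~mask) | (val & mask);
    map->unlock(map);
}
```

With `disable_locking` set, `model_update_bits()` never reaches the sleeping lock, so it becomes safe to call from a context that must not sleep. The trade-off is that the caller is now responsible for serializing access.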
This regmap_config causes forbidden locking
static struct regmap_config mcp23s08_regmap = {
// ...
//.disable_locking = 1, // << NOT SET! Bad: mutex will be used by regmap
};
Set disable_locking, and add explicit locking in special cases
static struct regmap_config mcp23s08_regmap = {
// ...
.disable_locking = 1, // Fix: disables regmap’s internal mutex
};
And add in mcp_pinconf_get/set:
mutex_lock(&mcp->lock);
/* regmap_read/write logic */
mutex_unlock(&mcp->lock);
Now, mcp->lock always controls access to MCP23xxx, and regmap does not try to lock a mutex within atomic context.
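Once the register map no longer locks itself, every caller has to hold the outer lock. The sketch below models why the pin configuration paths needed the extra locking. It is a userspace illustration with hypothetical `model_*` names and a boolean standing in for the lock, not driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the driver state; field names are illustrative only. */
struct model_mcp {
    bool lock_held;         /* stands in for the driver's lock being held */
    int unlocked_accesses;  /* register accesses made without the lock */
    unsigned int reg;       /* a single pretend register */
};

static void model_mcp_lock(struct model_mcp *mcp)
{
    assert(!mcp->lock_held);  /* the model does not allow recursive locking */
    mcp->lock_held = true;
}

static void model_mcp_unlock(struct model_mcp *mcp)
{
    assert(mcp->lock_held);
    mcp->lock_held = false;
}

/* With the map's own lock disabled, every access relies on the caller's lock. */
static unsigned int model_reg_read(struct model_mcp *mcp)
{
    if (!mcp->lock_held)
        mcp->unlocked_accesses++;
    return mcp->reg;
}

/* Shape of a pin configuration getter without the outer lock. */
unsigned int model_pinconf_get_unlocked(struct model_mcp *mcp)
{
    return model_reg_read(mcp);
}

/* Shape of the same getter with the access bracketed by the outer lock. */
unsigned int model_pinconf_get_locked(struct model_mcp *mcp)
{
    unsigned int val;

    model_mcp_lock(mcp);
    val = model_reg_read(mcp);
    model_mcp_unlock(mcp);
    return val;
}
```

In this model the unlocked getter is counted as an unprotected access, while the locked variant is not. That mirrors the reason the patch pairs the regmap change with explicit locking in the pin configuration code.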
References and Upstream Patch
- Official Linux kernel commit on kernel.org
- CVE-2024-57889 entry at cve.org
- LKML Patch Discussion *(example link, check real ones for updates)*
- Upgrade to a kernel with this patch (check the references for backports to LTS versions).
- This bug cannot leak data or escalate privilege *directly*, but it can crash your device, interrupting service.
- If affected and unable to upgrade, avoid configuring IRQs via these GPIO expanders until you can patch.
CVE-2024-57889 demonstrates how subtle kernel locking bugs can become system stability issues, even in rarely exercised hardware support code.
Timeline
Published on: 01/15/2025 13:15:13 UTC
Last modified on: 05/04/2025 10:05:57 UTC