CVE-2024-46691 - Unsafe Unregistration in Linux Kernel’s UCSI Subsystem Exposes Systems to Potential Kernel Panic

The Linux kernel is the backbone for millions of devices worldwide. Vulnerabilities in its core can lead to wide-reaching security and stability concerns. One such recent example, CVE-2024-46691, highlights the complexities of modern hardware interactions—in this case, the Universal Serial Bus Type-C Connector System Software Interface (UCSI) interacting with Qualcomm’s PMIC GLINK subsystem.

In this long read, we break down CVE-2024-46691: what triggered the vulnerability, how it was fixed, how it could be abused, and what it means for users and device manufacturers. Along the way we include code snippets, plain-language explanations, and pointers to the original patches, aimed at administrators and researchers alike.

Introduction

UCSI is the interface Linux uses to talk to the firmware that manages USB Type-C ports. The "GLINK" referenced here is part of the inter-processor communication mechanism on Qualcomm chips. GLINK delivers its callbacks in interrupt context, and a locking change around those callbacks meant that a function which may sleep ended up being called where sleeping is forbidden. Working around that, in turn, opened the door to a kernel NULL pointer dereference.

This all started with a patch to pmic_glink (commit 9329933699b3)—which, in the interest of safety, tucked certain list operations under a spinlock. However, this inadvertently caused ucsi_unregister() to run in an atomic (non-sleeping) context, which isn’t safe because it expects to be able to sleep.

The chain of events looks like this:

1. Patch Changes List Locking Behavior:

The commit moved pmic_glink’s client list access under a spinlock because the list is touched by low-level communication callbacks that run in IRQ context.

2. Unexpected Atomic Execution:

With this change, ucsi_unregister(), which can sleep (for example while waiting for resources to be released), could now be called from atomic context, where sleeping isn’t allowed.

3. Race Condition, Then NULL Pointer Dereference Risk:

- If the code sleeps in atomic context, the kernel can hang or panic.

- If the unregistration is deferred, it can run *after* the remote processor has shut down, when the communication channel is already gone, risking a NULL pointer dereference.
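The sleeping-in-atomic-context failure can be modelled in user space. The sketch below is a Python analogy, not kernel code: `atomic_section` is a hypothetical helper that mimics holding a spinlock, `might_sleep` imitates the kernel debug check of the same name, and `unregister` merely stands in for ucsi_unregister().

```python
import threading
from contextlib import contextmanager

_ctx = threading.local()  # per-thread "atomic depth" counter (analogy only)


@contextmanager
def atomic_section():
    """Hypothetical stand-in for holding a spinlock: no sleeping inside."""
    _ctx.depth = getattr(_ctx, "depth", 0) + 1
    try:
        yield
    finally:
        _ctx.depth -= 1


def might_sleep():
    """Imitates the kernel debug check: complain if we are 'atomic'."""
    if getattr(_ctx, "depth", 0) > 0:
        raise RuntimeError("sleeping function called from atomic context")


def unregister():
    """Stand-in for ucsi_unregister(), which may block during teardown."""
    might_sleep()
    return "unregistered"


def notify_under_lock():
    """Models the callback path once the client list sits under a spinlock."""
    with atomic_section():
        return unregister()
```

Calling `unregister()` directly succeeds, while `notify_under_lock()` raises, which is the user-space equivalent of the kernel complaining about a sleeping call in atomic context.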

Example Scenario: How the Bug Triggers

Imagine the firmware service behind your phone’s or laptop’s Type-C ports going away, for example because the remote processor that hosts it restarts or shuts down. The system fires callbacks in response, and these can run in interrupt context, leading straight to ucsi_unregister(). But what if the remote side is already gone by the time ucsi_unregister() tries to talk to it?

Result: A possible NULL pointer dereference in calls like pmic_glink_send().

Depending on timing, this could crash an unpatched device. On a patched kernel the failure is instead reported in the kernel log with an entry like:

ucsi_glink.pmic_glink_ucsi pmic_glink.ucsi.0: failed to send UCSI write request: -5
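The trailing `-5` in that log entry is a negated errno value. As a small aid for reading such messages, the sketch below decodes a kernel-style return code with Python's standard `errno` and `os` modules; `describe_errno` is a helper name introduced here for illustration.

```python
import errno
import os


def describe_errno(ret: int) -> str:
    """Turn a negative kernel-style return code into 'NAME: message'."""
    code = -ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


print(describe_errno(-5))
```

On Linux, errno 5 is EIO, the generic I/O error.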

Exploit Details

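While this isn't a remote code execution bug, it is a classic denial-of-service scenario: anything that repeatedly forces the UCSI client through registration and unregistration can hit the vulnerable sequence. A user (or a malicious peripheral) might attempt this by rapidly plugging and unplugging a Type-C device, although whether that alone reaches the unregistration path depends on the platform and its firmware. The result? System instability, kernel panics, or loss of Type-C (including charging) functionality until reboot.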

1. Connect a Type-C peripheral that triggers UCSI subsystem initialization.

2. Disconnect the peripheral at precisely the right moment (in practice by automating connection and disconnection) to hit the window where ucsi_unregister() is invoked from IRQ context.

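For researchers, a stress/fuzzing test could look like this:

import time
from pathlib import Path

# Hypothetical sysfs node: mainline kernels do not expose an "attach" control,
# so point this at whatever hook your own test rig provides.
ATTACH = Path("/sys/class/typec/port/attach")

for _ in range(1000):        # bounded run instead of an endless loop
    ATTACH.write_text("1")   # simulate connect
    time.sleep(0.05)
    ATTACH.write_text("0")   # simulate disconnect
    time.sleep(0.05)

> _Note:_ This is simplified pseudocode. Root privileges and real hardware access will be necessary in practice, but it gives an idea of how timing-dependent triggering works here.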

The Fix: Deferring Unregistration and Sanity Checks

The fix landed in mainline during the Linux 6.11 release-candidate cycle. Now, instead of unregistering immediately (and possibly sleeping in atomic context), the kernel *schedules* the unregistration to run later, in a safe, sleepable context, via a workqueue.

Here’s the essence of the fix, in simplified form:

// Old behavior, wrong context
ucsi_unregister(ucsi);

// New behavior: defer with workqueue
schedule_work(&ucsi->unregister_work);
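The deferral pattern can also be sketched in user-space Python. This is an analogy, not the kernel implementation: `WorkQueue` is a hypothetical stand-in for a kernel workqueue (a `queue.Queue` drained by one worker thread), and `on_service_down` plays the role of the interrupt-context callback that must return without blocking.

```python
import queue
import threading


class WorkQueue:
    """Hypothetical stand-in for a kernel workqueue: one worker thread."""

    def __init__(self):
        self._items = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def schedule_work(self, fn):
        # Enqueue only; the queue is unbounded, so the caller never blocks.
        self._items.put(fn)

    def flush(self):
        # Wait until every queued item has finished running.
        self._items.join()

    def _run(self):
        while True:
            fn = self._items.get()
            try:
                fn()  # runs on the worker thread, where blocking is allowed
            finally:
                self._items.task_done()


events = []
wq = WorkQueue()


def unregister():
    """Stand-in for ucsi_unregister(); records which thread ran it."""
    events.append(("unregister", threading.current_thread().name))


def on_service_down():
    """Models the callback: hand the slow work off and return at once."""
    wq.schedule_work(unregister)
    events.append(("callback returned", threading.current_thread().name))


on_service_down()
wq.flush()
```

After `flush()`, `events` shows the unregistration running on the worker thread rather than on the thread that delivered the callback. The trade-off is the same as in the kernel: the callback returns quickly, but the deferred work runs at some later point, by which time the remote side may already be gone.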

Additionally, pmic_glink_send() (which talks to the remote processor) now defensively checks whether the link is gone and bails out early. In simplified form:

int pmic_glink_send(...)
{
    if (!pmic_glink)  // Communication channel gone!
        return -ENODEV;
    // ...rest of function
}

If this happens, the kernel logs the clean error above rather than dereferencing memory and potentially crashing.
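The defensive check can be modelled in the same user-space style. In this hypothetical Python sketch, `Link` stands in for the pmic_glink state, its `endpoint` attribute becomes `None` once the remote processor is gone, and `send` returns a negative errno instead of touching the missing endpoint. The choice of `EIO` simply matches the `-5` in the log line quoted earlier; it is an assumption, not something copied from the patch.

```python
import errno


class Endpoint:
    """Hypothetical channel to the remote processor."""

    def __init__(self):
        self.sent = []

    def send(self, data: bytes) -> int:
        self.sent.append(data)
        return 0


class Link:
    """Stand-in for the pmic_glink state; endpoint is None after teardown."""

    def __init__(self):
        self.endpoint = Endpoint()

    def teardown(self):
        self.endpoint = None  # the remote processor went away

    def send(self, data: bytes) -> int:
        ep = self.endpoint
        if ep is None:  # guard: bail out instead of dereferencing None
            return -errno.EIO
        return ep.send(data)


link = Link()
before = link.send(b"UCSI write")  # link is up
link.teardown()
after = link.send(b"UCSI write")   # link is gone: error code, no crash
```

Without the `None` check, the second `send` would raise an `AttributeError`, the Python counterpart of the NULL pointer dereference the kernel fix avoids.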

References and Further Reading

- Linux kernel patch: usb: typec: ucsi: Move unregister out of atomic section
- Original commit 9329933699b3: soc: qcom: pmic_glink: Make client-lock non-sleeping
- CVE record: CVE-2024-46691

Conclusion

CVE-2024-46691 teaches us valuable lessons about how subtle changes in context (atomic vs sleepable) can unleash a surprising range of kernel bugs—especially when interacting with modern hardware that generates interrupts independent of software logic.

If you maintain or ship Linux devices with UCSI/PMIC-GLINK hardware:

- Patch ASAP: The fix landed in mainline during the 6.11 release-candidate cycle and is being backported to stable trees.

- Audit similar handlers: Look for other code that may sleep but can be reached from IRQ or atomic context.

Stay tuned to kernel.org and the linux-usb mailing list for timely updates.



Timeline

Published on: 09/13/2024 06:15:13 UTC
Last modified on: 09/15/2024 17:57:45 UTC