CVE-2025-22014 - Deadlock in Linux Kernel QCOM PDR Subsystem (Full Analysis & Exploit Details)

In early 2025, the security community identified a subtle but significant concurrency bug in the Qualcomm Protection Domain Restart (PDR) driver, which is part of the Linux kernel (the soc: qcom: pdr subsystem). The bug can deadlock the driver's locator handling, leading to service lookup failures and a potential denial of service (DoS). The issue is tracked as CVE-2025-22014.

In this post, we’ll break down the vulnerability, illustrate how it can be triggered, provide code snippets, and link to relevant references. The aim is to explain the issue in a clear and accessible way for engineers and sysadmins.

What is the PDR Subsystem?

The QCOM PDR (Protection Domain Restart) mechanism is used on Qualcomm-based Linux systems (common in Android phones and IoT devices) to locate and track protection domains, which are essentially services running in firmware on remote processors such as the ADSP. Clients use it to find a service and to be notified when its domain goes down or restarts. Communication is carried over QMI (Qualcomm MSM Interface) messages, which the kernel processes on workqueues.

The deadlock involves two actors:

- Process A: Calls pdr_add_lookup() to register interest in a service, which schedules the locator work.
- Process B: Receives a new server notification, calls pdr_locator_new_server(), sets locator_init_complete = true, and then tries to take pdr->list_lock to service pending lookups.

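Both sides depend on the same ordered workqueue (qmi->wq) and the same mutex (pdr->list_lock). The locator work started by Process A sees locator_init_complete set, takes pdr->list_lock, sends a domain list request, and waits for the reply. That reply has to be processed on qmi->wq, where Process B's new server handler is still running and is itself blocked waiting for pdr->list_lock. Because the queue is ordered, the reply cannot be handled until Process B finishes, and Process B cannot finish until Process A releases the lock. Process A's wait eventually times out, so the lookup fails with -110 (-ETIMEDOUT).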
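The head-of-line blocking described above can be reproduced with a small userspace model. This is only a sketch under stated assumptions: the enum, the function name lookup_completes(), and the flag new_server_takes_list_lock are all made up for illustration, and the model replaces real threads and mutexes with a strictly in-order loop over queued items.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Work items that land on the ordered queue (stand-in for qmi->wq). */
enum wq_item {
    NEW_SERVER_EVENT,      /* handled by pdr_locator_new_server() */
    DOMAIN_LIST_RESPONSE   /* reply that Process A is waiting for */
};

/*
 * Returns true if Process A's lookup gets its reply, false if the reply
 * can never be delivered (which the real driver reports as a timeout).
 *
 * Assumption of this model: Process A already holds list_lock and only
 * releases it once the reply arrives.
 */
static bool lookup_completes(const enum wq_item *queue, size_t n,
                             bool new_server_takes_list_lock)
{
    bool list_lock_held_by_a = true;
    bool reply_delivered = false;

    assert(queue != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        switch (queue[i]) {
        case NEW_SERVER_EVENT:
            /* Ordered queue: a blocked item stalls everything behind it. */
            if (new_server_takes_list_lock && list_lock_held_by_a)
                return false;
            break;
        case DOMAIN_LIST_RESPONSE:
            reply_delivered = true;
            list_lock_held_by_a = false; /* A finishes and unlocks */
            break;
        }
    }
    return reply_delivered;
}
```

With the queue { NEW_SERVER_EVENT, DOMAIN_LIST_RESPONSE }, the model returns false when the new server handler takes the lock and true when it does not, which mirrors the behavior before and after the fix.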

Sequence Diagram

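Process A                          Process B

pdr_add_lookup()
  |
  |-> schedule locator work
                                   qmi_data_ready_work()
                                     |
                                   pdr_locator_new_server()
                                     pdr->locator_init_complete = true;
pdr_locator_work()
  mutex_lock(&pdr->list_lock)
  pdr_locate_service()
    pdr_get_domain_list()
    [waits for QMI reply]          mutex_lock(&pdr->list_lock)
                                   [BLOCKED: lock held by Process A]
  [reply is queued behind Process B on qmi->wq; wait times out with -110]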

When this bug hits, your dmesg or kernel log might show the following (-110 is -ETIMEDOUT):

PDR: tms/servreg get domain list txn wait failed: -110
PDR: service lookup for msm/adsp/sensor_pd:tms/servreg failed: -110
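If you want to scan collected logs for these two messages, a small matcher is enough. The sketch below is an illustration, not part of any kernel tooling: parse_pdr_failure() is a hypothetical helper, and it only recognizes the two message shapes quoted above, with or without a leading dmesg timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns true and stores the error code if `line` looks like one of the
 * two PDR failure messages; returns false for anything else.
 */
static bool parse_pdr_failure(const char *line, int *code)
{
    const char *pdr, *failed;
    char *end;
    long val;

    assert(line != NULL && code != NULL);

    /* Skip any timestamp prefix by searching for the "PDR: " tag. */
    pdr = strstr(line, "PDR: ");
    if (pdr == NULL)
        return false;

    if (strstr(pdr, "get domain list txn wait failed") == NULL &&
        strstr(pdr, "service lookup for") == NULL)
        return false;

    failed = strstr(pdr, "failed: ");
    if (failed == NULL)
        return false;
    failed += strlen("failed: ");

    val = strtol(failed, &end, 10);
    if (end == failed)
        return false; /* no number after "failed: " */

    *code = (int)val;
    return true;
}
```

A returned code of -110 corresponds to -ETIMEDOUT on the generic Linux errno layout used by arm64, which is the value to look for when hunting this particular bug.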

Part 2: Code Snippet — The Fix

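The deadlock was caused by an unnecessary list iteration under pdr->list_lock in the new server path: pdr_locator_new_server() walked the pending lookups and scheduled the locator work for each one that still needed a lookup. The locator work already walks that list itself, so the extra locking was redundant. The fix removes the iteration and simply calls schedule_work(), leaving the real work to the scheduled handler.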

Old Code (Vulnerable)

mutex_lock(&pdr->list_lock);
list_for_each_entry(pds, &pdr->lookups, node) {
    // Redundant: the locator work walks this list itself
    if (pds->need_locator_lookup)
        schedule_work(&pdr->locator_work);
}
mutex_unlock(&pdr->list_lock);

Patched Code

// Don't grab list_lock or iterate here.
// Only schedule the work, which will lock and operate.
schedule_work(&pdr->locator_work);

Patch Reference:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=50af3e34d1e3

Impact

- Denial of Service: Any client waiting for service location (such as sensor or audio subsystems) may experience unexpected timeouts or failures.
- Regression/Audio Failure: The in-kernel pd-mapper makes this race easier to hit, and the upstream fix is reported to also resolve an audio regression seen in that configuration.

How to Trigger (PoC)

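A malicious or buggy userland program could repeatedly cause service lookups while also sending crafted QMI packets that trigger the new server flow, increasing the probability of hitting the race. Because pdr_add_lookup() is an in-kernel API, userland reaches it only indirectly, through whatever driver registers the lookup.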

// Pseudocode: both helpers are hypothetical stand-ins, not real APIs
for (int i = 0; i < 100; ++i) {
    async_pdr_add_lookup("msm/adsp/sensor_pd", ...);
    // Simulate the QMI packet that announces the locator is up
    send_qmi_new_server_packet();
}

Run this in parallel from two processes. If the kernel is unpatched, you may see the -110 timeout in the kernel log.

How to Fix

- Upgrade your kernel to a version that includes mainline commit 50af3e34d1e3.
- If unable to upgrade: Disable offending services or avoid simultaneous lookups that can trigger the race (not always feasible).

Confirming the Patch

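Check that the change mentioned above is in your kernel tree: in drivers/soc/qcom/pdr_interface.c, pdr_locator_new_server() should no longer take pdr->list_lock or iterate over the lookup list, and should only call schedule_work().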
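For a quick automated check over source text, a rough heuristic is to look at the body of the function you are auditing and see whether it still mentions the lock. The helper below is a sketch: function_body_mentions() is a made-up name, it assumes kernel coding style (the function's closing brace alone at the start of a line), and it assumes the first match of the pattern is the definition rather than a prototype. Which function to audit is also an assumption to verify against the upstream diff; pdr_locator_new_server() is used here as the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if `needle` appears between the first occurrence of
 * `func_open` (for example "pdr_locator_new_server(") and the next
 * closing brace at the start of a line. Returns false if the function
 * is not found at all, so a false result needs a manual look.
 */
static bool function_body_mentions(const char *src, const char *func_open,
                                   const char *needle)
{
    const char *def, *end, *hit;

    assert(src != NULL && func_open != NULL && needle != NULL);

    def = strstr(src, func_open);
    if (def == NULL)
        return false;

    end = strstr(def, "\n}");    /* end of the function body, if present */
    hit = strstr(def, needle);
    if (hit == NULL)
        return false;

    return end == NULL || hit < end;
}
```

Calling it with the contents of pdr_interface.c, "pdr_locator_new_server(" and "list_lock" should return false on a patched tree under these assumptions. Treat a true result as a prompt to read the code, not as proof of vulnerability.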

References

- Upstream Patch
- Linux kernel QCOM PDR source tree
- CVE Page (NVD) *(pending listing)*

Conclusion

CVE-2025-22014 highlights how subtle queueing and locking bugs in system software can have critical device-wide impacts. If you run Linux on Qualcomm devices, make sure you’re running the patched kernel. For developers, always audit worker/lock interplay—especially with hardware-specific code!

If you want to test your devices or need mitigation strategies tailored to your deployment, consult with your kernel vendor or security team.

*Thanks to Bjorn and Johan for catching this and helping upstream the fix!*

Timeline

Published on: 04/08/2025 09:15:25 UTC
Last modified on: 04/10/2025 13:15:50 UTC