1. What is CVE-2021-47038?

CVE-2021-47038 is a concurrency vulnerability in the Linux kernel’s Bluetooth stack. A change that added the BT_PHY socket option introduced an inconsistent locking order between the Bluetooth socket lock and the Bluetooth device lock, which made a deadlock possible. A process or attacker that hits the problematic interleaving can make the affected system hang, especially when working with Bluetooth sockets.

A deadlock like this leaves the tasks involved blocked indefinitely: services or applications that depend on them become unresponsive and, in severe cases, the whole system hangs.


2. How Did the Vulnerability Happen?

The root cause lies in how the Linux kernel’s Bluetooth code locked resources. Specifically, starting with commit eab2404ba798 ("Bluetooth: Add BT_PHY socket option"), there was a new relationship between the hci_dev->lock (Bluetooth device lock) and the "socket lock".

The lock sequence went like this:

- When using the BT_PHY socket option, code would sometimes hold a *socket lock* and then try to take a *Bluetooth device lock*.
- Elsewhere, the reverse could happen: take the *Bluetooth device lock* first, then the *socket lock*.

This inconsistency creates a classic deadlock scenario (often called AB-BA): two resources, two threads, each holding one resource and waiting for the other. Neither can progress!


3. Deadlock Explained: The Chain Reaction

Let’s break down the warning and its implications, using part of the lockdep trace that the kernel developers included in their description of the bug:

WARNING: possible circular locking dependency detected
...
bluetoothd/1118 is trying to acquire lock:
  &hdev->lock
but task is already holding lock:
  sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP
which lock already depends on the new lock.

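Lockdep reports this as a *possible* deadlock because it has observed both lock orderings. If two threads actually take the locks in opposite orders at the same time, each waits on the other forever. That’s a deadlock.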

The kernel stack backtrace showed precisely this circular dependency.
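To make the idea behind that warning concrete, here is a minimal user-space sketch of a lock-order validator. It is not how lockdep is actually implemented; the lock class names and the record_acquire() helper are made up for illustration. It remembers which lock was held when another one was taken, and rejects an acquisition that would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes for illustration; not kernel identifiers. */
enum lock_class { SK_LOCK, HDEV_LOCK, NUM_CLASSES };

/* order[a][b] is true once some task has taken b while holding a. */
static bool order[NUM_CLASSES][NUM_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
bool reachable(enum lock_class from, enum lock_class to,
               bool visited[NUM_CLASSES])
{
    if (from == to)
        return true;
    visited[from] = true;
    for (int next = 0; next < NUM_CLASSES; next++) {
        if (order[from][next] && !visited[next] &&
            reachable((enum lock_class)next, to, visited))
            return true;
    }
    return false;
}

/*
 * Record "taking 'wanted' while holding 'held'".
 * Returns false, and records nothing, if the new edge would close a cycle,
 * i.e. if 'held' is already reachable from 'wanted'.
 */
bool record_acquire(enum lock_class held, enum lock_class wanted)
{
    bool visited[NUM_CLASSES] = { false };

    assert(held < NUM_CLASSES && wanted < NUM_CLASSES);

    if (reachable(wanted, held, visited))
        return false;
    order[held][wanted] = true;
    return true;
}
```

Recording HDEV_LOCK followed by SK_LOCK is accepted; a later attempt to record SK_LOCK followed by HDEV_LOCK is rejected. That second call corresponds to the getsockopt path in the trace above.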

In pseudocode:

// Thread A:
lock(socket_lock);
lock(hdev_lock); // blocks if Thread B already holds

// Thread B:
lock(hdev_lock);
lock(socket_lock); // blocks if Thread A already holds
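
The same interleaving can be replayed safely in user space. The sketch below is only an illustration: ordinary pthread mutexes stand in for the two kernel locks, and the function name is a placeholder. It walks through the four steps in a single thread and uses pthread_mutex_trylock(), which returns EBUSY instead of blocking, so the stall is reported rather than suffered.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-ins for the two kernel locks; plain user-space mutexes here. */
static pthread_mutex_t socket_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t hdev_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Replay the AB-BA interleaving step by step.
 * Returns true if both simulated tasks would be stuck waiting on each other.
 */
bool replay_abba_interleaving(void)
{
    /* Step 1: task A takes the socket lock. */
    int rc = pthread_mutex_lock(&socket_lock);
    assert(rc == 0);

    /* Step 2: task B takes the device lock. */
    rc = pthread_mutex_lock(&hdev_lock);
    assert(rc == 0);

    /* Step 3: task A now wants the device lock, which B holds. */
    bool a_blocked = pthread_mutex_trylock(&hdev_lock) == EBUSY;

    /* Step 4: task B now wants the socket lock, which A holds. */
    bool b_blocked = pthread_mutex_trylock(&socket_lock) == EBUSY;

    /* Release both locks so the function can be called again. */
    pthread_mutex_unlock(&hdev_lock);
    pthread_mutex_unlock(&socket_lock);

    return a_blocked && b_blocked;
}
```

With blocking lock calls in steps 3 and 4, and two real threads, neither step would ever return. On older toolchains this may need to be built with -pthread.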

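Real Kernel Example (cut for clarity)

bluetoothd/1118 is trying to acquire lock: &hdev->lock
but task is already holding lock: sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP

Previously recorded dependency (the socket lock taken in a path that already depends on hdev->lock):
 lock_sock_nested
 l2cap_sock_ready_cb
 l2cap_config_rsp
 ...

New acquisition (hdev->lock taken while the socket lock is held):
 hci_conn_get_phy
 l2cap_sock_getsockopt
 ...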

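hci_conn_get_phy() took the device lock itself, while its caller l2cap_sock_getsockopt() was already holding the socket lock. The function did not actually need that lock: according to the fix’s commit message, it neither relies on hdev staying unchanged while it runs nor reads any member of hdev. Taking the lock anyway created the socket-lock-then-device-lock ordering, which could deadlock against paths that use the opposite order.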
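
A heavily reduced sketch may help show why such a lock can be redundant. The structures and names below (demo_dev, demo_conn, demo_conn_get_phy) are hypothetical stand-ins, not the kernel’s definitions. The point is only that a function which reads per-connection fields and never touches the device structure gains nothing from holding the device lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-ins; not the kernel's definitions. */
struct demo_dev;                   /* device-wide state with its own lock */

enum demo_link_type { DEMO_ACL_LINK, DEMO_LE_LINK };

#define DEMO_PHY_BR_1M   (1u << 0)
#define DEMO_PHY_EDR_2M  (1u << 1)

struct demo_conn {
    struct demo_dev *hdev;         /* back-pointer; never dereferenced below */
    enum demo_link_type type;
    int supports_edr;              /* per-connection flag */
    uint32_t le_phys;              /* per-connection LE PHY bits */
};

/* Derive the PHY bitmask from connection state only; no device lock taken. */
uint32_t demo_conn_get_phy(const struct demo_conn *conn)
{
    uint32_t phys = 0;

    assert(conn != NULL);

    switch (conn->type) {
    case DEMO_ACL_LINK:
        phys |= DEMO_PHY_BR_1M;
        if (conn->supports_edr)
            phys |= DEMO_PHY_EDR_2M;
        break;
    case DEMO_LE_LINK:
        phys |= conn->le_phys;
        break;
    }
    return phys;
}
```

Because demo_conn_get_phy() never follows the hdev pointer, it works even when that pointer is NULL, which is a quick way to confirm that no device state is involved.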


4. The Exploit Potential

This bug is not a classic "remote code execution" or "privilege escalation", but it can result in high-impact denial of service. An attacker, or even a normal program using the affected Bluetooth APIs in a certain pattern, could lock up part or all of the system.

All that’s required: two or more processes (or threads) operating on Bluetooth sockets and devices in such a way that they trigger the deadlock chain. Once the deadlock occurs, the blocked tasks never resume, and the affected parts of the system may stay hung until a reboot.

Suppose two threads do this (kernel-side view, simplified):

// Thread 1: user space calls getsockopt(BT_PHY) on an L2CAP socket
l2cap_sock_getsockopt(...);   // lock_sock(sk): takes socket lock
  hci_conn_get_phy(...);      // Tries to take hdev->lock

// Thread 2: a path in which the socket lock is taken after hdev->lock,
// directly or through intermediate locks
mutex_lock(&hdev->lock);      // Takes hdev lock
// ... eventually reaches a socket callback
l2cap_sock_ready_cb(...);     // lock_sock_nested(): tries to take socket lock

If timed "just right", both threads wait forever.
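
For reference, the user-space side of Thread 1 is an ordinary socket call. The sketch below shows what querying BT_PHY on an L2CAP socket can look like. The numeric constants are copied from the Linux Bluetooth headers as I recall them (treat them as assumptions and prefer the real headers where available), and query_bt_phy() is a hypothetical helper. The socket here is never connected, so the kernel is expected to reject the query with an error such as ENOTCONN before it reaches the code path discussed above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumed values from the Linux Bluetooth headers, defined locally so the
 * sketch builds without the BlueZ development headers installed. */
#ifndef AF_BLUETOOTH
#define AF_BLUETOOTH 31
#endif
#define DEMO_BTPROTO_L2CAP 0
#define DEMO_SOL_BLUETOOTH 274
#define DEMO_BT_PHY        14

/*
 * Ask the kernel which PHYs the connection behind an L2CAP socket supports.
 * Returns 0 and fills *phys on success, or a negative errno value.
 */
int query_bt_phy(uint32_t *phys)
{
    assert(phys != NULL);

    int fd = socket(AF_BLUETOOTH, SOCK_SEQPACKET, DEMO_BTPROTO_L2CAP);
    if (fd < 0)
        return -errno;   /* e.g. no Bluetooth support on this system */

    socklen_t len = sizeof(*phys);
    int rc = getsockopt(fd, DEMO_SOL_BLUETOOTH, DEMO_BT_PHY, phys, &len);
    int result = (rc < 0) ? -errno : 0;

    close(fd);
    return result;
}
```

On a machine without Bluetooth support the socket() call itself fails, and the helper simply returns that error.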

5. Patch & Fix

The fix, as explained in the upstream commit message, is to remove the unnecessary locking in hci_conn_get_phy(). This is safe because that function didn’t actually need the device lock in the first place.

Patch Example (simplified):

- hci_dev_lock(conn->hdev);    // wrapper around mutex_lock(&hdev->lock)
  // function body
- hci_dev_unlock(conn->hdev);  // wrapper around mutex_unlock(&hdev->lock)



6. Takeaways for Developers

- Be careful with lock order: Always acquire shared resources in a consistent order to avoid deadlocks, especially in complex systems like the Linux kernel.
- Minimize locking: Don’t lock if you don’t need to! Every unnecessary lock increases complexity and the risk of bugs.
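- Use lockdep and similar tools: The kernel’s run-time lock validator (lockdep, enabled with CONFIG_PROVE_LOCKING) and static checkers can catch complex deadlocks before they reach users. The warning quoted in section 3 came from lockdep.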
- Review changes with concurrency in mind: When adding features (like new socket options), always consider how they interact with existing lock hierarchies.
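
As an illustration of the first takeaway, one common way to enforce a consistent order in user-space code is to always lock a pair of mutexes in address order. This is a general technique, not what the kernel fix did (the fix simply dropped the unneeded lock), and lock_pair() and unlock_pair() are hypothetical helper names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Lock two distinct mutexes in a globally consistent order (lowest address
 * first), so callers passing (x, y) and (y, x) cannot deadlock each other.
 */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    assert(a != NULL && b != NULL && a != b);

    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release both mutexes; the unlock order does not affect deadlock safety. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    assert(a != NULL && b != NULL && a != b);

    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

Two threads calling lock_pair(&x, &y) and lock_pair(&y, &x) both end up taking the lower-addressed mutex first, so the AB-BA pattern cannot arise between them.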


7. References

- CVE-2021-47038 on NVD
- Kernel Bugzilla #212491 (Red Hat)
- Upstream kernel commit: Remove unneeded mutex
- Discussion thread on the Linux Bluetooth mailing list

Summary

CVE-2021-47038 was a potentially serious flaw in the Linux kernel’s Bluetooth stack, causing possible system hangs due to a locking bug. By carefully reviewing locking patterns and removing unnecessary locks, the Linux community quickly fixed the issue. For everyone writing multi-threaded or kernel code today: always watch your locks!



Timeline

Published on: 02/28/2024 09:15:39 UTC
Last modified on: 12/06/2024 20:56:10 UTC