CVE-2021-46987 describes a subtle deadlock in the Linux kernel’s Btrfs filesystem. When _qgroups_ (quota groups) are enabled and a clone operation has to copy an _inline extent_, the operation can hang because of a circular lock dependency. The bug only triggers for certain file operations, when there is not enough metadata space and quota accounting is in effect.
This post demystifies the issue with simple explanations, shows where and how the problem occurred, and discusses possible real-world exploit potential.
## Technical Background
Btrfs is an advanced copy-on-write filesystem used in Linux. It supports features like snapshots, subvolumes, and quotas (qgroups). When a file (or part of a file) is _cloned_, Btrfs normally shares the underlying extents between source and destination instead of copying data. Small files can be stored as an _inline extent_, with the data embedded directly in the metadata tree, and in a few cases cloning such an extent requires copying the data into a page of the destination inode.
Qgroups are quota mechanisms allowing administrators to set and monitor space usage in flexible ways.
The deadlock unfolds in four steps:
1. The clone operation locks the file range in the destination inode and, when it has to copy an inline extent, dirties a page of that inode.
2. It then starts a transaction. If the filesystem is low on space, Btrfs tries to _flush delalloc_ (write out delayed allocations) to free up space.
   - Flushing delalloc involves writing dirty pages, which requires acquiring the _same lock_ already held at step 1.
3. Qgroups complicate matters.
   - Quota accounting inside a transaction might also need to flush delalloc, repeating the circular lock dependency.
4. If delalloc can’t proceed because the range is already locked, and the lock can’t be released until the transaction completes (which waits for delalloc), the kernel deadlocks.
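The circular wait can be illustrated in userspace with an analogy. The sketch below is not kernel code: it assumes the `flock` utility is installed, uses an advisory file lock to stand in for the extent range lock, and the function name `demo_self_deadlock` is hypothetical. The "clone" holds the lock while it runs a "flush" that needs the same lock; the flush uses a non-blocking attempt (`-n`) so the demo terminates instead of hanging.

```sh
# Userspace analogy only: an flock(1) file lock plays the role of the
# extent range lock. demo_self_deadlock is a hypothetical name.
demo_self_deadlock() {
    lockfile=$(mktemp) || return 1
    # Outer flock: the "clone" takes the lock and keeps it while the inner
    # command runs, like the clone holding the range lock while it starts
    # a transaction.
    # Inner flock -n: the "flush" tries to take the same lock through a new
    # file descriptor. A blocking attempt would wait forever; -n fails
    # immediately so we can report the conflict.
    flock "$lockfile" sh -c '
        if flock -n "$1" true; then
            echo "flush completed"
        else
            echo "flush blocked: lock already held by the waiting caller"
        fi
    ' sh "$lockfile"
    rm -f "$lockfile"
}

demo_self_deadlock
```

In the kernel path there was no non-blocking attempt or timeout, so the equivalent of the inner step simply waited while the outer step waited on it.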
The stack trace shows a writeback worker blocked on a page lock that the clone task cannot release:
__schedule
schedule
io_schedule
__lock_page
extent_write_cache_pages [btrfs]
...
extent_writepages [btrfs]
...
wb_writeback
...
wb_workfn
...
process_one_work
worker_thread
kthread
ret_from_fork
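To check whether a machine has tasks stuck in this kind of wait, you can look for processes in uninterruptible sleep (state `D`). The following is a minimal sketch, assuming a procps-style `ps` that accepts `-eo pid,stat,wchan:32,comm`; `filter_blocked` is a hypothetical helper name. A task in `D` state is not proof of this bug, only a starting point for inspection.

```sh
# filter_blocked: read "ps -eo pid,stat,wchan:32,comm" output on stdin and
# keep the header line plus every task whose STAT column starts with "D"
# (uninterruptible sleep). Hypothetical helper name.
filter_blocked() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Live check; prints only the header when nothing is blocked.
ps -eo pid,stat,wchan:32,comm 2>/dev/null | filter_blocked

# As root, with sysrq enabled, the kernel can also dump the stacks of all
# blocked tasks to its log (not run here):
#   echo w > /proc/sysrq-trigger
```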
## A Sketch of the Problematic Code
// Pseudocode: clone_inline_extent in Btrfs
lock_extent_range(dest_inode, range);
mark_page_dirty(dest_inode, page);
start_transaction(); // Reserving space here may trigger a delalloc flush
// If qgroups are active and space is low, the qgroup reservation may also
// flush delalloc, which tries to lock the same range again and deadlocks.
*The actual fix avoids starting a transaction while the lock and dirty page are held, or ensures flushing won’t require acquiring the same lock.*
## Proof of Concept
This problem is hard to trigger reliably without filesystem-level control, but it may be hit with concurrent filesystem stress tools such as fsstress, or with parallel reflink copies of a small file on a nearly full filesystem:
```sh
btrfs quota enable /mnt/btrfs
# Launch several reflink copies of a small file in parallel
for i in {1..10}; do
    cp --reflink=always smallfile "/mnt/btrfs/copy_$i" &
done
wait
```
Watch for system freeze, high load, or kernel logs with stacktraces like above.
Important: This does not corrupt data, but will freeze involved I/O operations until you reboot.
- Attack Surface: Local users with write access to a Btrfs filesystem may trigger a denial of service (DoS).
- Impact: The bug can hang user or system processes, causing a soft DoS (the affected filesystem operations stay blocked until reboot).
- Privilege Requirement: There is no privilege escalation. Unprivileged users can lock up their own processes and, if quotas are globally enabled, may also stall system services that use Btrfs in containers or multi-user environments.
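Since exposure depends on quotas being enabled, a quick check of a given mount point can help with triage. The sketch below assumes btrfs-progs is installed and that the caller is allowed to query qgroups (usually root); `quotas_enabled` is a hypothetical helper name, and `/mnt/btrfs` is just the example path used in this post. It relies on `btrfs qgroup show` failing when quotas are not enabled, so any other failure (missing tool, wrong path, no permission) is also reported as "not enabled or unknown".

```sh
# quotas_enabled MOUNTPOINT: succeed if "btrfs qgroup show" works on the
# mount point, which is taken to mean qgroups are enabled there.
# Hypothetical helper; failures of any kind count as "not enabled or unknown".
quotas_enabled() {
    btrfs qgroup show "$1" >/dev/null 2>&1
}

mnt=${1:-/mnt/btrfs}
if quotas_enabled "$mnt"; then
    echo "qgroups enabled on $mnt"
else
    echo "qgroups not enabled on $mnt, or status could not be determined"
fi
```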
## Fixed in Kernel
The bug was fixed by Filipe Manana in kernel commit f6e7fc59e1cb in early 2021, then backported to stable kernels.
How it was fixed:
The transaction start mechanism was rethought to ensure that quota accounting does not trigger a flush which requires a lock already held by the current operation. The fix essentially moves or delays certain operations to prevent circular lock dependencies.
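To get a first indication of whether a running kernel already includes the fix, you can compare its version against the releases that carry the patch. The sketch below is hedged in two ways: the fix points it uses (5.11.22, 5.12.5, and mainline 5.13) are assumptions based on the upstream CVE record and should be verified, and distribution kernels often backport fixes without changing the base version. The helper names `version_ge` and `kernel_has_fix` are hypothetical.

```sh
# Sketch only. The fixed versions below are assumptions taken from the
# upstream CVE record (5.11.22, 5.12.5, mainline 5.13); distribution
# kernels may carry the backport under a different version number.

# version_ge A B: succeed if dotted version A >= B (first three fields).
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# kernel_has_fix RELEASE: succeed if RELEASE (e.g. "5.12.5-arch1") is at or
# past the assumed fix point for its stable series.
kernel_has_fix() {
    v=${1%%-*}   # drop any "-suffix" such as "-generic"
    case $v in
        5.11|5.11.*) version_ge "$v" 5.11.22 ;;
        5.12|5.12.*) version_ge "$v" 5.12.5 ;;
        *)           version_ge "$v" 5.13 ;;
    esac
}

if kernel_has_fix "$(uname -r)"; then
    echo "kernel version is at or past the assumed fix point"
else
    echo "kernel version predates the assumed fix; check for a vendor backport"
fi
```

A negative result here does not prove the kernel is vulnerable; your distribution's advisory for CVE-2021-46987 is the authoritative source.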
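Until you can update, these workarounds reduce exposure:
- Disable quotas if you do not need them:
```sh
btrfs quota disable /mnt/btrfs
```
- Avoid near-full metadata usage:
Monitor with:
```sh
btrfs filesystem df /mnt/btrfs
```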
- Update to a kernel that contains the fix: mainline 5.13 or later, or a stable or distribution kernel that carries the backport.
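To turn the `btrfs filesystem df` output into a single number you can alert on, a small filter along these lines may help. It is a sketch: `metadata_used_pct` is a hypothetical helper name, and it assumes the raw-bytes output format produced with `-b`, where the metadata line looks like `Metadata, <profile>: total=<bytes>, used=<bytes>`.

```sh
# metadata_used_pct: read "btrfs filesystem df -b" output on stdin and print
# metadata usage as a whole percentage (used / total of allocated metadata
# chunks). Hypothetical helper name.
metadata_used_pct() {
    awk -F'[=,]' '/^Metadata/ {
        # Splitting on "=" and "," yields:
        #   $1 "Metadata"  $2 " <profile>: total"  $3 <total>  $4 " used"  $5 <used>
        if ($3 > 0) printf "%d\n", ($5 * 100) / $3
    }'
}
```

Typical use would be `btrfs filesystem df -b /mnt/btrfs | metadata_used_pct`. A high value is only a hint, since Btrfs can allocate further metadata chunks while unallocated space remains on the device.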
---
## References
- Official Fix Commit: btrfs: fix deadlock when cloning inline extents and using qgroups
- Original Bug Report: lkml.org email thread (2021)
- Btrfs qgroups documentation: Btrfs Wiki - Quota
---
Summary:
_CVE-2021-46987 is a local DoS bug in Btrfs triggered by rare circumstances during clone operations with enabled quotas. While not a security issue in the usual sense, it can cause headaches on multi-user systems or servers using quotas. Patch your kernel if you use Btrfs with quotas._
## Timeline
Published on: 02/28/2024 09:15:37 UTC
Last modified on: 12/06/2024 15:07:49 UTC