In mid-2021, a subtle yet critical vulnerability was discovered and resolved in the Linux kernel’s Btrfs (B-tree File System) implementation. Identified as CVE-2021-47072, this bug allowed removed *dentries* — directory entries — to persist even after filesystem logs had been synced. In the event of a system crash or power loss, this could leave the filesystem in a state where deleted or moved directory entries would still appear after recovery. For those running mission-critical workloads or storing sensitive data, this flaw posed a silent data consistency risk.

Let’s walk through what went wrong, how the Btrfs code handles directory operations, what an attacker could realistically do with the bug, and why the fix matters.

What’s a “Dentry” and Why Does It Matter?

A dentry (“directory entry”) is how Linux’s filesystems (including Btrfs) keep track of which files and folders exist in a directory and where their metadata resides. When you *move* a file or subdirectory from one location to another, Btrfs is responsible for updating all the pointers and records (dentries) to reflect the move and ensure the old location is truly empty.

This process is especially important when the filesystem *logs* recent changes for crash consistency.

Scenario: Moving and Logging Inodes

Suppose you have a directory (testdir) and a subdirectory (dira). You create enough files in testdir to spread the directory’s items over multiple Btrfs leaves (the B-tree blocks that hold the actual items). Next, you move dira from testdir to another location. Logically, testdir/dira should disappear, and only its new path should exist after an fsync and a power cycle.

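But with CVE-2021-47072, the outcome depended on what had been logged earlier. If the parent directory was fsynced after a change that touched only its inode item (like chmod), the log became authoritative only for the key ranges derived from that one modified leaf. When dira was then moved and the log synced, the keys of the removed dentry fell outside every range the log was authoritative for. As a result, after a crash, *both* the old and new dentries could be restored, leaving you with a “ghost” directory entry. This violated Btrfs’s consistency model and could confuse user-space programs.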

Demonstrating the Bug

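The bug can be demonstrated with carefully controlled steps involving a permission change, fsyncs, a rename, and a forced outage. Here’s a reproducible sequence based on the reproducer in the upstream commit message (/dev/sdc is an example scratch device that will be wiped):

# Set up a Btrfs filesystem with the largest supported node size (64K) so the leaf layout is predictable:
mkfs.btrfs -f -n 65536 /dev/sdc
mount /dev/sdc /mnt

# Create a test directory and populate it with enough files to spread its items over several leaves:
mkdir /mnt/testdir
chmod 755 /mnt/testdir
for i in {1..1200}; do touch /mnt/testdir/file$i; done

# Create the subdirectory that will be moved later; its dentry must land in a
# different leaf than the one holding the inode item of testdir:
mkdir /mnt/testdir/dira

# Persist everything to disk:
sync

# Change permissions on the parent directory, which modifies (COWs) only the leaf holding its inode item:
chmod 700 /mnt/testdir

# fsync the parent directory to log its changes (xfs_io ships with xfsprogs):
xfs_io -c "fsync" /mnt/testdir

# Move dira out of testdir, here to the filesystem root:
mv /mnt/testdir/dira /mnt

# Modify and fsync any file so that the log, including the rename, is persisted:
xfs_io -c "pwrite -S 0xab 0 64K" -c "fsync" /mnt/testdir/file1

# Simulate power loss here (e.g., hard-reset the VM; a clean unmount commits the transaction and hides the bug)
# After reboot, mount again to replay the log. On a vulnerable kernel, dira is listed in both /mnt and /mnt/testdir:
mount /dev/sdc /mnt
ls -R /mnt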

In this scenario, a power failure after the final fsync means recovery relies on log replay, and on a vulnerable kernel the old directory entry resurfaces, so the moved item appears in both places. This is not supposed to happen: only the new parent should reference the moved directory.
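The post-recovery check can be scripted instead of read off an `ls` listing. The following is a minimal sketch, not part of the upstream reproducer: `check_moved` and `inode_of` are hypothetical helper names, and the verdict strings are chosen only for illustration. It compares inode numbers so that an unrelated object recreated at the old path is not mistaken for a stale entry.

```shell
#!/bin/sh
# inode_of PATH: print the inode number of PATH itself (not of its contents).
inode_of() {
    ls -di -- "$1" | awk '{ print $1 }'
}

# check_moved OLD NEW (hypothetical helper)
# Prints one verdict about an entry that was moved from OLD to NEW:
#   ok        - only NEW exists, as expected after the move
#   ghost     - OLD and NEW both exist and share an inode (stale entry)
#   different - OLD exists but refers to some other object
#   missing   - NEW does not exist, so the move itself did not survive
check_moved() {
    old=$1
    new=$2
    if [ ! -e "$new" ]; then
        echo "missing"
    elif [ ! -e "$old" ]; then
        echo "ok"
    elif [ "$(inode_of "$old")" = "$(inode_of "$new")" ]; then
        echo "ghost"
    else
        echo "different"
    fi
}
```

Assuming dira was moved to /mnt/dira, running `check_moved /mnt/testdir/dira /mnt/dira` after remounting should report `ok` on a fixed kernel, while `ghost` would indicate the stale entry described above.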

How the Bug Occurred in Btrfs Code

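The Btrfs log-tree code tries to be efficient and *only logs recent changes*: when a directory is fsynced, it logs the directory entries found in the leaves modified in the current transaction. Alongside them it records "authoritative ranges", the ranges of directory keys (BTRFS_DIR_ITEM_KEY and BTRFS_DIR_INDEX_KEY offsets) for which the log is considered to hold the true state. During replay, an entry that exists on disk inside such a range but is missing from the log gets deleted. In this case, the old parent directory was not marked as authoritative for the key range including the just-removed dentry, because that dentry lived in a leaf the earlier fsync had not touched.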

As a result, when replaying the log after a crash, Btrfs could believe that both the old and new parents referenced the entry, when only the new one should have.

Relevant kernel pseudocode

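// A simplified illustration of the replay logic (not actual kernel code)
if (!range_contains_removed_dentry) {
    // The log is not authoritative for this key range, so replay
    // leaves the on-disk dentry alone: the stale entry survives.
} else {
    // The log is authoritative and does not contain the dentry,
    // so replay deletes it and only the new dentry remains.
}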
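To make the role of authoritative ranges concrete, here is a toy model of the replay decision for a single on-disk dentry. It is only a sketch: `replay_dentry` is a hypothetical name, the key offsets and ranges are made-up numbers rather than values from a real filesystem, and the actual kernel logic handles far more cases.

```shell
#!/bin/sh
# replay_dentry KEY IN_LOG RANGE...   (toy model, not kernel behaviour in full)
#   KEY     numeric key offset of a dentry found in the on-disk tree
#   IN_LOG  "yes" if the log also contains this dentry, otherwise "no"
#   RANGE   START:END (inclusive) ranges for which the log is authoritative
# Prints "keep" or "delete".
replay_dentry() {
    key=$1
    in_log=$2
    shift 2
    if [ "$in_log" = yes ]; then
        echo keep                 # the log confirms the entry still exists
        return
    fi
    for range in "$@"; do
        start=${range%%:*}
        end=${range##*:}
        if [ "$key" -ge "$start" ] && [ "$key" -le "$end" ]; then
            echo delete           # authoritative range, entry absent from log
            return
        fi
    done
    echo keep                     # no authoritative range covers this key
}

# Made-up numbers: the old dentry sits at key 900 and is not in the log.
replay_dentry 900 no 0:100 1203:9999   # prints "keep": the stale entry survives
replay_dentry 900 no 0:9999            # prints "delete": the range covers the key
```

The first call mirrors the buggy situation, where the ranges come only from the leaf touched by the earlier fsync. The second shows what happens once the log is authoritative for the range holding the removed dentry.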

Security and Data Impacts

- Integrity Failure: The main risk was stale directory entries that should not exist after certain operations and a crash.
- Confusion in Apps: Applications expecting atomic moves could see the same file or directory in both old and new locations.
- Potential Data Exposure: In multi-user setups, deleted or moved items unexpectedly persisting can be a data leakage vector.

The Fix

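The fix (kernel commit b20dca073d, merged in Linux 5.13), titled "btrfs: fix removed dentries still existing after log is synced", makes sure that when a previously logged inode is renamed out of a previously logged directory, the next log attempt on that old parent directory is not skipped. The log then becomes authoritative for the key range holding the removed dentry, regardless of which leaves were affected by the original change.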

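Illustration of the fix

// Illustrative pseudocode only, not the literal patch; the identifiers
// below are placeholders rather than real kernel symbols.
// Ensure the removed dentry is covered by the log even if it’s not in the COW-ed leaves
if (dentry_was_removed && !range_was_logged) {
    /* Make the log authoritative for the removed dentry's key range */
    btrfs_log_del_dentry(trans, ...);
}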

The new logic ensures that during log replay, only the correct (new) parent possesses the dentry, as it should.

Exploit Possibility

While this vulnerability is primarily a data integrity flaw, not a route to code execution or privilege escalation, it could matter on a multi-user system. Suppose a user moves a sensitive directory out of a widely readable location into a restricted one, and a crash happens shortly after. Due to this bug, the old entry could reappear on boot and make the directory’s contents reachable through a path the user believed was gone.

In practical use, it’s a reliability and privacy issue: your filesystems may not be as consistent or secure as you thought if you’re running a vulnerable kernel.

How to Stay Safe

- Upgrade: Btrfs users should upgrade their kernel to 5.13 or newer, or apply backports of the fix if using an LTS or distro kernel.
- Check Data Consistency: If you suspect past exposure to improper log replays, consider scanning for duplicate or “ghost” directories after unexpected shutdowns.
- Fsync Carefully: Critical applications should use fsync() or sync() after important moves/deletions anyway, but this bug proves even “safe” patterns can need closer inspection.
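
The first two items can be partly automated. The sketch below is a rough heuristic under stated assumptions: `version_at_least` and `duplicate_dir_inodes` are hypothetical helper names, the version test only looks at the mainline version number (it cannot tell whether an older stable or distribution kernel carries a backport), and the scan assumes a ghost entry shows up as one directory inode reachable through more than one path.

```shell
#!/bin/sh
# version_at_least RELEASE MAJOR MINOR
# Succeeds if a kernel release string such as "5.15.0-91-generic" is at
# least MAJOR.MINOR. Backported fixes in older kernels are NOT detected.
version_at_least() {
    release=${1%%-*}              # drop suffixes such as "-generic" or "-rc3"
    major=${release%%.*}
    rest=${release#*.}
    minor=${rest%%.*}
    minor=${minor%%[!0-9]*}       # tolerate trailing characters such as "+"
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "${minor:-0}" -ge "$3" ]; }
}

# duplicate_dir_inodes ROOT
# Prints inode numbers that belong to more than one directory path below
# ROOT. A directory normally has exactly one path, so any output deserves a
# closer look. -xdev keeps the walk inside one filesystem or subvolume.
duplicate_dir_inodes() {
    find "$1" -xdev -type d -exec ls -di {} + | awk '{ print $1 }' | sort -n | uniq -d
}

if version_at_least "$(uname -r)" 5 13; then
    echo "kernel version is 5.13 or newer"
else
    echo "kernel predates 5.13: check whether your distribution backported the fix"
fi
```

For example, `duplicate_dir_inodes /mnt` lists suspicious inode numbers, and `find /mnt -xdev -inum N` then shows the paths for a given number N. On Btrfs, nested subvolume roots all report inode 256, so hits on that number are expected and can be ignored.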

Conclusion

CVE-2021-47072 is a great example of how complex filesystems like Btrfs can have nuanced bugs with real-world impact. In everyday use, most users would never notice—until something vital silently goes wrong after a crash. Thanks to careful bug reports and diligent kernel maintainers, this flaw has been fixed. But it’s a reminder: in filesystems, even small consistency holes can become big problems over time.


Timeline

Published on: 03/01/2024 22:15:47 UTC
Last modified on: 01/09/2025 19:42:34 UTC