In 2021, a vulnerability was discovered and fixed in the Ceph filesystem implementation for the Linux kernel, tracked as CVE-2021-47000. This bug could cause serious issues on affected systems by leaking kernel resources—in particular, *inodes*—if certain file operations failed. In this article, I'll break down what the bug was, how it worked, how it was fixed, and why it's important for both system administrators and software developers. I'll also provide links to original sources and show you how an attacker or a misconfigured program could trigger the leak.

What Actually Happened?

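The vulnerability occurred in the __fh_to_dentry function of the Ceph filesystem driver (fs/ceph/export.c). This internal helper resolves a file handle (fh) to an inode and then to a dentry (directory entry). File handles are decoded, for example, when a CephFS mount is re-exported over NFS or when a program calls open_by_handle_at(2).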

An *inode* is the kernel object that represents a file in the filesystem. Inodes are reference-counted; if a reference is taken and never dropped (an *inode leak*), the kernel keeps the inode pinned in memory and can never reclaim it, slowly eating up available resources.

The root cause: __fh_to_dentry did not release its inode reference when the getattr (get attributes) call it makes on that inode returned an error.

Technical Details

Here's a simplified version of the problematic code (before the fix), based on the commit on kernel.org:

struct dentry *__fh_to_dentry(struct super_block *sb, ... ) {
    struct inode *inode = ceph_get_inode(sb, ...);
    int err = ceph_do_getattr(inode); // This might fail!

    if (err) {
        // PROBLEM: We forgot to release the inode here!
        return ERR_PTR(err);
    }

    // more code...
}

If ceph_do_getattr failed, the function would return early without releasing the inode reference, leaking it.

Why Is This a Problem?

A leaked reference pins the inode in memory, so the system never frees it. Over time, repeated getattr failures during file-handle lookups (caused by something as simple as requests for broken or unreachable Ceph files) steadily consume kernel memory, potentially leading to a denial of service.
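To make the exhaustion concrete, here is a minimal userspace sketch of the same mistake. Nothing in it is kernel code: toy_inode, toy_get_inode, toy_iput, toy_getattr, and the tiny fixed-size pool are all hypothetical stand-ins chosen for illustration, and the pool size of 8 is an arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define POOL_SIZE 8 /* deliberately tiny so exhaustion is easy to see */

/* Hypothetical stand-in for a kernel inode: only a reference count. */
struct toy_inode {
    int refcount; /* 0 means the slot is free */
};

static struct toy_inode pool[POOL_SIZE];

/* Models the lookup: returns a slot holding one reference, or NULL. */
static struct toy_inode *toy_get_inode(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool[i].refcount == 0) {
            pool[i].refcount = 1;
            return &pool[i];
        }
    }
    return NULL; /* every slot is pinned */
}

/* Models iput(): drop one reference; a count of 0 frees the slot. */
static void toy_iput(struct toy_inode *inode)
{
    assert(inode->refcount > 0);
    inode->refcount--;
}

/* Models a getattr call that may fail with a negative errno. */
static int toy_getattr(const struct toy_inode *inode, int fail)
{
    (void)inode;
    return fail ? -EIO : 0;
}

/* Same shape as the pre-fix code: the error return skips toy_iput(). */
static int lookup_buggy(int fail)
{
    struct toy_inode *inode = toy_get_inode();
    int err;

    if (!inode)
        return -ENOMEM;
    err = toy_getattr(inode, fail);
    if (err)
        return err;  /* leak: the slot stays pinned forever */
    toy_iput(inode); /* success: the caller is finished with it */
    return 0;
}

/* Counts slots that are still referenced. */
static int pinned_slots(void)
{
    int n = 0;

    for (size_t i = 0; i < POOL_SIZE; i++)
        n += pool[i].refcount > 0;
    return n;
}
```

In this model, calling lookup_buggy(1) POOL_SIZE times pins every slot, after which even a request that would have succeeded, lookup_buggy(0), returns -ENOMEM. The real kernel has far more headroom than eight slots, but the failure mode has the same shape.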

How Was It Fixed?

The fix adds the missing release, so the inode reference is dropped when getattr fails. Here's the *fixed* code snippet:

struct dentry *__fh_to_dentry(struct super_block *sb, ... ) {
    struct inode *inode = ceph_get_inode(sb, ...);
    int err = ceph_do_getattr(inode);
    if (err) {
        iput(inode); // Properly release the inode!
        return ERR_PTR(err);
    }
    // more code...
}

By calling iput(inode), the function drops the reference it took during lookup. Once the last reference is gone, the kernel can evict the inode and reclaim its memory.
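The actual fix calls iput() directly before the early return. A related idiom that is common in kernel code routes every exit through one cleanup label, which makes it harder to forget a release when new error paths are added later. The sketch below shows that alternative in userspace C. It is not the upstream patch, and toy_inode, toy_iget, toy_iput, toy_getattr, and the two counters are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a kernel inode: only a reference count. */
struct toy_inode {
    int refcount;
};

static int refs_taken;    /* total references acquired */
static int refs_released; /* total references dropped */

/* Models acquiring a reference during lookup. */
static void toy_iget(struct toy_inode *inode)
{
    inode->refcount++;
    refs_taken++;
}

/* Models iput(): drop one reference. */
static void toy_iput(struct toy_inode *inode)
{
    assert(inode->refcount > 0);
    inode->refcount--;
    refs_released++;
}

/* Models a getattr call that may fail with a negative errno. */
static int toy_getattr(const struct toy_inode *inode, int fail)
{
    (void)inode;
    return fail ? -EIO : 0;
}

/*
 * Every exit after toy_iget() passes through out_put, so the reference
 * is dropped on success and on failure alike. Simplification: in the
 * kernel, the success path goes on to use the reference instead of
 * dropping it right away.
 */
static int lookup_fixed(struct toy_inode *inode, int fail)
{
    int err;

    toy_iget(inode);
    err = toy_getattr(inode, fail);
    if (err)
        goto out_put;
    /* ... work that needs the inode would go here ... */
out_put:
    toy_iput(inode);
    return err;
}
```

After any mix of successful and failing calls, refs_taken equals refs_released and the inode's refcount is back to zero, which is the balance the pre-fix code lacked.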

See the commit fixing the bug:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1da3eb587227f164dae3b845fdf38a36f84878e4

Exploit Details

While this wasn't a classic "remote code execution" or privilege escalation vulnerability, it *could* be exploited to cause a denial of service (DoS) on affected systems.

An attack would look like this:

- Repeatedly attempt to access Ceph files in a way that triggers errors in ceph_do_getattr.
- Each failure leaks an inode, until the system runs out of memory or other resources and possibly crashes or becomes unresponsive.

For example, a user or a script could simply run a loop like this:

while :; do
    stat /mnt/ceph/broken-or-deleted-file
done

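If /mnt/ceph is backed by the Ceph filesystem, the request reaches __fh_to_dentry, and the target file causes an error in getattr, each loop iteration could leak an inode. Over thousands of iterations, the kernel would eventually choke. Keep in mind that __fh_to_dentry sits on the file-handle decoding path (for instance, when the CephFS mount is re-exported over NFS and the loop runs on an NFS client); a plain stat on a local CephFS path goes through ordinary path lookup, so treat the loop as an illustration of the access pattern.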

Note: Modern systems have mitigations against this (like cgroups and resource quotas), but not all setups are protected.

References

Kernel commit (the fix, with diff):

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1da3eb587227f164dae3b845fdf38a36f84878e4

CVE page:

https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-47000

Ceph kernel source:

https://github.com/torvalds/linux/tree/master/fs/ceph

Who Is Affected?

- Any Linux kernel with the Ceph filesystem enabled, *prior to the fix's merge in early 2021*.
- Distributions shipping kernels before the fix may be exposed (check your vendor's advisory page or backport status).

How to Protect Yourself

- Upgrade your kernel to a version with the fix included (see the reference commit above).
- If you must use an old kernel, restrict user access to Ceph mounts and monitor kernel memory usage for leaks.
- Use monitoring tools that alert on resource exhaustion, such as steadily growing slab usage in /proc/meminfo or rising inode counts.
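As a starting point for the monitoring advice above, the following sketch pulls a single field out of /proc/meminfo so that it can be sampled over time. The helper names meminfo_kb and read_meminfo_kb are hypothetical, and the choice of which field to watch (for example Slab or SUnreclaim) and what threshold to alert on is an assumption left to the operator. A steadily growing value is only a hint of a leak, not proof of this particular bug.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns the value in kB for `key` (e.g. "Slab") from text in
 * /proc/meminfo format, or -1 if the key is absent or malformed.
 */
static long meminfo_kb(const char *text, const char *key)
{
    size_t klen;
    const char *line = text;

    assert(text != NULL && key != NULL);
    klen = strlen(key);

    while (line && *line) {
        /* A match is the key at the start of a line followed by ':'. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;

            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++; /* step past the newline to the next line */
    }
    return -1;
}

/* Reads /proc/meminfo (Linux only) and looks up one key. */
static long read_meminfo_kb(const char *key)
{
    char buf[8192];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1; /* not Linux, or /proc is not mounted */
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_kb(buf, key);
}
```

A monitoring script could call read_meminfo_kb("Slab") at a fixed interval and raise an alert when the value keeps climbing without a matching workload change.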

Conclusion

CVE-2021-47000 is a great example of how small mistakes—forgetting to clean up after an error—can cause real-world reliability and security problems. While this bug doesn't let attackers take over the system, it shows that careful resource management is a must for kernel developers. Always check error paths, especially when handling reference-counted objects like inodes.

Stay safe, keep your setups patched, and if you're a kernel hacker, always clean up, especially on errors!

Timeline

Published on: 02/28/2024 09:15:38 UTC
Last modified on: 11/01/2024 15:35:02 UTC