Summary:
A high-impact bug (CVE-2024-26956) in the Linux kernel's nilfs2 filesystem could let an attacker, or simply a faulty disk, crash the kernel through corrupted filesystem metadata. Let’s break down the vulnerability, review key code snippets, and see how this was patched.
What is NILFS2?
NILFS2 is a log-structured filesystem for Linux, popular for continuous snapshotting and fast writes. Like any complex software, bugs in the filesystem code can have severe implications: crashes, data loss, or denial-of-service.
The Vulnerability: Double Flaw in nilfs_get_block()
CVE-2024-26956 describes two flaws that combine to make the Linux kernel hit a fatal BUG_ON() check. The root cause is how nilfs2 handles corruption of its internal metadata:
Core Problem
- When nilfs2 tries to locate a data block through a corrupted mapping (bad or missing disk address translation metadata), nilfs_dat_translate() fails with -ENOENT and that code is passed back unchanged. The caller cannot tell it apart from "this block does not exist", so the corruption is masked and the code further up assumes everything is still OK, until the kernel hits a BUG_ON().
Detailed Flow
1. BTree/Direct Mapping Block Translation Fails:
When nilfs_get_block() locates a data block through a btree or direct mapping, it relies on nilfs_dat_translate() to turn a virtual block number into a disk block address. If the translation metadata is corrupted, nilfs_dat_translate() returns -ENOENT.
2. Upstream Function Misses the Error:
Unfortunately, this -ENOENT isn't always properly handled by nilfs_get_block(). Sometimes, the failure result is passed up and misinterpreted as “block does not exist”, mixing up “missing” with “unreachable/corrupt”.
3. Buffer Not Marked as Mapped:
nilfs_get_block() still reports success in this inconsistent state. The result is a buffer that is NOT mapped to a valid disk block, but upper layers of the storage subsystem (like __block_write_begin_int()) try to read from or write to it anyway.
4. submit_bh_wbc() Panic:
When writing, the generic block I/O code (see submit_bh_wbc()) expects that if a buffer is sent for I/O, its mapping info is correct. Instead, it finds the buffer is not mapped and triggers a BUG_ON()—instantly crashing the kernel.
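The four steps above can be sketched as a small userspace model. This is not kernel code: `struct model_bh` and every `model_*` function are hypothetical stand-ins invented for illustration, and the real lookup path is considerably more involved. The sketch only shows how reading -ENOENT as "no block here" leaves a buffer unmapped while the caller still sees success.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head; only the mapped flag matters. */
struct model_bh {
    bool mapped;
    unsigned long blocknr;
};

/* Models the translation step: a corrupted entry reports -ENOENT. */
static int model_translate(bool corrupted, unsigned long vblk,
                           unsigned long *pblk)
{
    if (corrupted)
        return -ENOENT;
    *pblk = vblk + 1000; /* arbitrary fake disk address */
    return 0;
}

/*
 * Models the unpatched behaviour: -ENOENT is read as "no block here",
 * so the function reports success without mapping the buffer.
 */
static int model_get_block_unpatched(bool corrupted, unsigned long vblk,
                                     struct model_bh *bh)
{
    unsigned long pblk = 0;
    int ret;

    assert(bh != NULL);
    ret = model_translate(corrupted, vblk, &pblk);
    if (ret == -ENOENT)
        return 0; /* corruption mistaken for a missing block */
    if (ret < 0)
        return ret;

    bh->mapped = true;
    bh->blocknr = pblk;
    return 0;
}

/* Models the precondition that submit_bh_wbc() enforces with BUG_ON(). */
static bool model_submit_would_bug(const struct model_bh *bh)
{
    return !bh->mapped;
}
```

Calling `model_get_block_unpatched()` with `corrupted` set to true returns 0 yet leaves `mapped` false, which is exactly the state the I/O submission check refuses to accept.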
The Exploit Scenario
While there is no “remote code execution” here, anyone who can corrupt the on-disk NILFS2 metadata and then write to the filesystem can force a fatal crash by steering the kernel into this codepath. That includes an attacker who supplies a crafted image or has raw access to the device, as well as a buggy program or script.
_Server admins beware:_ A user with write access to a NILFS2 partition could make your system fall over.
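Before (Vulnerable Version)
In fs/nilfs2/inode.c, the problematic function is nilfs_get_block(). The snippet is simplified: in the kernel tree, the nilfs_dat_translate() call is made by the btree and direct mapping lookup code that nilfs_get_block() reaches through the bmap layer.
    // ... inside nilfs_get_block()
    ret = nilfs_dat_translate(dat, blkoff, &pblocknr);
    if (unlikely(ret)) {
        // -ENOENT bubbles up unchanged and is read as "block does not exist"
        goto out;
    }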
After (Patched Version)
The patch changes the return value when disk address translation fails with -ENOENT:
    // ... inside nilfs_get_block()
    ret = nilfs_dat_translate(dat, blkoff, &pblocknr);
    if (ret == -ENOENT) {
        // Always report corruption as a metadata error
        ret = -EINVAL;
        goto out;
    }
    // ... (original handling continues)
Result:
By changing -ENOENT to -EINVAL, upstream code treats the failure as a genuine filesystem error. The nilfs2 bmap layer converts -EINVAL to -EIO, so the kernel can report the error or remount the filesystem read-only instead of crashing.
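The conversion chain from -ENOENT to -EINVAL to -EIO can be sketched in userspace as follows. Every `sketch_*` name is hypothetical and made up for this illustration, and the `fs_error_reported` flag merely stands in for whatever error reporting the filesystem performs. It is a minimal model of the idea, not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Models the translation step: a corrupted entry reports -ENOENT. */
static int sketch_translate(bool corrupted)
{
    return corrupted ? -ENOENT : 0;
}

/* Models the patched lookup: -ENOENT from translation becomes -EINVAL. */
static int sketch_lookup_patched(bool corrupted)
{
    int ret = sketch_translate(corrupted);

    if (ret == -ENOENT)
        ret = -EINVAL; /* metadata corruption, not a missing block */
    return ret;
}

/*
 * Models the mapping-layer conversion: -EINVAL is recorded as a
 * filesystem error and handed to the caller as -EIO.
 */
static int sketch_convert_error(int err, bool *fs_error_reported)
{
    assert(fs_error_reported != NULL);
    if (err == -EINVAL) {
        *fs_error_reported = true;
        return -EIO;
    }
    return err;
}

/* Models the patched block lookup as seen by the caller. */
static int sketch_get_block_patched(bool corrupted, bool *fs_error_reported)
{
    return sketch_convert_error(sketch_lookup_patched(corrupted),
                                fs_error_reported);
}
```

With `corrupted` set to true, the caller now receives a negative error code instead of a success status paired with an unmapped buffer, so it never reaches the I/O submission step.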
How To Reproduce (Simplified Exploit)
Triggering this bug requires filesystem corruption. A filesystem fuzzer such as syzbot reaches it with roughly the following sequence:
1. Write valid data to files on a mounted NILFS2 partition
2. Directly corrupt the DAT (disk address translation table) on disk using direct disk writes or with a crafted filesystem image
3. Try to read/write the corrupted file from userspace
Sample dmesg:
kernel BUG at fs/buffer.c:1942!
invalid opcode: 0000 [#1] SMP
Process syz-executor (pid: 32696, ti=... task=... )
...
RIP: submit_bh_wbc+0x2ae/0x3e
Final Thoughts: Why It Matters
Although this bug requires on-disk corruption, mistakes like this are goldmines for fuzzers and a real headache for sysadmins. Correct detection and signaling of disk errors is critical to avoid catastrophic data loss or outages.
If you run Linux servers with NILFS2, patch immediately or make sure you’re running a kernel at or newer than the fix commit (Feb 8, 2024). Always keep backups, and consider tools that can scan and repair metadata regularly.
Stay safe!
Timeline
Published on: 05/01/2024 06:15:11 UTC
Last modified on: 11/04/2024 17:35:12 UTC