CVE-2021-46950 - Critical RAID1 Data Corruption Bug in Linux Kernel
In early 2021, a serious vulnerability was found in the Linux kernel’s RAID1 (Mirrored Disk) code, tracked as CVE-2021-46950. This bug could cause silent data corruption in systems using software RAID1 arrays with bitmap support. Even with RAID1’s high reliability reputation, this flaw exposes users to unexpected data loss. In this post, we'll break down what went wrong, explore the fix, show code before and after, and explain how you can protect your data.
Background:
Linux’s RAID1 (“mirroring”) stores identical copies of data across two or more drives. “Bitmaps” are a feature used to track which data blocks need synchronizing, especially after a system crash or brief disk failure—it makes rebuilds much faster and safer.
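To make the idea of a write-intent bitmap concrete, here is a minimal userspace sketch in C. It is a toy model only: the `wib` struct and the `wib_*` function names are hypothetical and are not the kernel's md bitmap API, and the real implementation is considerably more elaborate. The sketch assumes one bit per chunk and shows the intended lifecycle: set the bit before writing, and clear it only once every mirror has acknowledged the write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy write-intent bitmap: one bit per chunk, 64 chunks in total.
 * All names are hypothetical; this is not the kernel's md bitmap code. */
#define WIB_CHUNKS 64

struct wib {
    uint64_t dirty;   /* bit i set => chunk i may differ between mirrors */
};

/* Set the bit BEFORE issuing the write, so a crash mid-write leaves a trace. */
void wib_start_write(struct wib *b, unsigned chunk)
{
    assert(chunk < WIB_CHUNKS);
    b->dirty |= (uint64_t)1 << chunk;
}

/* Clear the bit only when every mirror acknowledged the write. */
void wib_end_write(struct wib *b, unsigned chunk, bool all_mirrors_ok)
{
    assert(chunk < WIB_CHUNKS);
    if (all_mirrors_ok)
        b->dirty &= ~((uint64_t)1 << chunk);
}

/* After a crash or transient failure, only dirty chunks need resyncing. */
bool wib_needs_resync(const struct wib *b, unsigned chunk)
{
    assert(chunk < WIB_CHUNKS);
    return (b->dirty >> chunk) & 1;
}
```

In this model a chunk whose write failed keeps its bit set, so a later resync knows to copy it again instead of scanning the whole disk.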
The Problem:
When a write operation fails on a RAID1 array, the RAID software’s bitmap is still updated as if the write had succeeded. Linux may then treat a failed write as if it reached disk, which is the exact opposite of RAID1’s goal.
- The bug lives in the code that handles the end of a failed write in drivers/md/raid1.c.
- When a write failed, the code cleared the bitmap bit, marking the disk block as up-to-date, instead of leaving it set as “needs re-write/sync.”
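Let’s look at the buggy logic (simplified for illustration, not verbatim kernel source):
```c
// From drivers/md/raid1.c, older version
if (test_bit(R1BIO_Fail, &r1_bio->state)) {
        /* The write failed, yet the bitmap is told it succeeded
         * (fourth argument, success = 1), so the dirty bit is cleared. */
        bitmap_endwrite(conf->mddev->bitmap, r1_bio->sector,
                        r1_bio->sectors, 1, 1);
}
```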
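What’s wrong? On the failure path the bitmap is told the write succeeded, so the bit for that region is cleared and the block is never queued for a resync.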
3. The Patch: Correct Failure Signaling
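The fix ensures the bitmap bit is cleared only if the write fully succeeded. If the write failed, the “needs syncing” marker stays set. In the upstream patch this happens in the failure path of raid1_end_write_request, which now marks the request as failed (R1BIO_Degraded) so that its bitmap bits are not cleared. The snippet below is a simplified view of that idea: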
```c
// From drivers/md/raid1.c, after patch
if (!test_bit(R1BIO_Fail, &r1_bio->state)) {
        bitmap_endwrite(conf->mddev->bitmap, r1_bio->sector,
                        r1_bio->sectors, 1, 1);
}
// else: Don't clear the bit if the write failed!
```
Result:
- If a write fails, the bitmap stays “dirty.” Linux will retry or resync that block later, keeping your data safe.
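The difference between the two behaviours can be sketched with a toy two-disk mirror in userspace C. Everything here is hypothetical (the `toy_raid1` struct, the `toy_*` functions, and the `buggy` switch are illustration only, not kernel code), and the model assumes one dirty flag per block and a resync that copies only dirty blocks from the first mirror to the second.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 4

/* Hypothetical two-disk mirror with one dirty flag per block. */
struct toy_raid1 {
    char disk[2][NBLOCKS];   /* one byte of "data" per block */
    bool dirty[NBLOCKS];     /* toy stand-in for the write-intent bitmap */
};

/* Write `value` to `block`. `disk1_fails` simulates a failed write on the
 * second mirror; `buggy` selects the pre-patch behaviour described above. */
void toy_write(struct toy_raid1 *a, int block, char value,
               bool disk1_fails, bool buggy)
{
    assert(block >= 0 && block < NBLOCKS);
    a->dirty[block] = true;            /* mark intent before writing */
    a->disk[0][block] = value;         /* first mirror always succeeds here */
    if (!disk1_fails)
        a->disk[1][block] = value;

    bool success = !disk1_fails;
    if (buggy || success)
        a->dirty[block] = false;       /* buggy path clears even on failure */
}

/* Resync copies only dirty blocks from mirror 0 to mirror 1. */
void toy_resync(struct toy_raid1 *a)
{
    for (int i = 0; i < NBLOCKS; i++) {
        if (a->dirty[i]) {
            a->disk[1][i] = a->disk[0][i];
            a->dirty[i] = false;
        }
    }
}

/* True when both mirrors hold identical data. */
bool toy_mirrors_match(const struct toy_raid1 *a)
{
    return memcmp(a->disk[0], a->disk[1], NBLOCKS) == 0;
}
```

With `buggy` set, a failed write followed by `toy_resync` leaves the two mirrors different, because the resync has no record of the stale block. With `buggy` clear, the dirty flag survives the failure and the resync repairs the second mirror.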
Patch reference:
- patchwork.kernel.org commit
- git.kernel.org commit
There is no classic exploit here; the impact is silent data loss, which in some cases is far more damaging.
- *Scenario*: If a disk silently fails a write (hardware hiccup, cable unplug), and you later rebuild the array or recover a server, you may get wrong data from a supposedly “healthy” RAID1 set.
```shell
mdadm --create /dev/md --level=1 --raid-devices=2 /dev/sdX /dev/sdY --bitmap=internal
```
Update your Linux kernel
- Distributions patched this bug in early 2021/2022.
- Look for the fixed commit: 53df32b6274a9c9a297d39358e79d51d86216f67
```shell
mdadm --misc --detail /dev/md
mdadm --examine-bitmap /path/to/file
```
6. References & Further Reading
- CVE-2021-46950 at NVD
- Linux kernel bugzilla #212005
- Patch diff on lore.kernel.org
Conclusion
RAID is for reliability, but no software is perfect. CVE-2021-46950 is a reminder that even “safe” technologies need regular auditing and updates. If you run Linux software RAID1 and haven’t updated your kernel in recent years, check your kernel version *now* and patch if needed—your data’s safety depends on it.
*If you have questions or need help checking your system, leave a comment below!*
Timeline
Published on: 02/27/2024 19:04:06 UTC
Last modified on: 04/10/2024 20:13:16 UTC