
Summary:
CVE-2021-46964 was a Linux kernel bug in the QLogic qla2xxx SCSI driver. It caused kernel crashes when running on systems with few CPUs, like dual-core machines, due to bad IRQ vector allocation logic. This post explains in simple terms how the bug was introduced, how it manifested (with code examples and stack traces), the underlying risk, and how it was fixed.

What Is qla2xxx and Why Does It Matter?

The qla2xxx kernel module drives QLogic Fibre Channel HBAs in data centers everywhere. A flaw here risks the stability and reliability of enterprise storage.

In high-performance configurations, the driver can assign a queue pair to each CPU and use MSI-X interrupts to service them efficiently. Each queue pair needs its own IRQ vector. This mapping is delicate, especially on systems with only a few CPUs.

The Vulnerability: Fewer IRQs Than Needed (Commit a6dcfe08487e)

A commit meant to *optimize* IRQ allocation (“Limit interrupt vectors to number of CPUs”) accidentally reduced it *too much*.

Before the change:
The driver would ask for ‘enough’ vectors for all CPUs *plus* several reserved interrupts.

After the change:
It allocated only as many vectors as there are CPUs, which is not enough to also cover the reserved mailbox, default response queue, and ATIO interrupts.

Buggy Equation in the Code

ha->max_qpairs = ha->msix_count - 1 /* MB interrupt */
                                - 1 /* Default RSP queue */
                                - 1 /* ATIO, only needed for certain modes */;

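On a two-CPU host in initiator mode, only two vectors are allocated. Subtracting the MB interrupt and the default response queue (ATIO is not needed in initiator mode) leaves 2 - 1 - 1 = 0. That means *ZERO* regular queue pairs.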
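The arithmetic can be checked in user space with a minimal sketch. The helper names `max_qpairs_for` and `print_qpair_table` are hypothetical and not part of the driver, and the clamp at zero is an assumption made only to keep the illustration tidy.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical user-space helper, not driver code: reproduces the
 * max_qpairs arithmetic for a given number of MSI-X vectors.
 */
int max_qpairs_for(int msix_count, bool target_mode)
{
    int qpairs = msix_count;

    qpairs -= 1;            /* MB (mailbox) interrupt */
    qpairs -= 1;            /* default RSP queue */
    if (target_mode)
        qpairs -= 1;        /* ATIO, dual or pure target mode only */

    /* Assumption: clamp at zero so the sketch never reports a negative count. */
    return qpairs < 0 ? 0 : qpairs;
}

/* Print the result for small hosts where vectors == CPUs (the buggy case). */
void print_qpair_table(int max_cpus)
{
    for (int cpus = 1; cpus <= max_cpus; cpus++)
        printf("cpus=%d vectors=%d max_qpairs=%d\n",
               cpus, cpus, max_qpairs_for(cpus, false));
}
```

Calling `print_qpair_table(4)` from a small `main()` lists the queue pair count for one to four CPUs, which makes the two-CPU case easy to spot.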

But the driver still believes queue pairs are available. Whenever a SCSI command is submitted, qla2xxx_queuecommand() runs this code:

if (ha->mqenable) {
    uint32_t tag;
    uint16_t hwq;
    struct qla_qpair *qpair = NULL;

    tag = blk_mq_unique_tag(cmd->request);
    hwq = blk_mq_unique_tag_to_hwq(tag);
    qpair = ha->queue_pair_map[hwq];   // <- CRASH RIGHT HERE

    if (qpair)
        return qla2xxx_mqueuecommand(host, cmd, qpair);
}

Because max_qpairs was zero, ha->queue_pair_map was never allocated and is still NULL. Indexing it with hwq dereferences a NULL pointer and crashes the kernel.
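To make the inconsistent state concrete, here is a small user-space model. The types `hba_model` and `qpair_model` and the function `lookup_qpair` are hypothetical simplifications, and the guard shown is only an illustration of the failing condition, not the upstream fix (which changes how many vectors are allocated).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the driver's structures (hypothetical). */
struct qpair_model { int id; };

struct hba_model {
    bool mqenable;                          /* multi-queue thought to be on */
    uint16_t max_qpairs;                    /* zero on the affected hosts */
    struct qpair_model **queue_pair_map;    /* NULL when nothing was allocated */
};

/* Return the queue pair for hwq, or NULL if the map cannot be indexed. */
struct qpair_model *lookup_qpair(const struct hba_model *ha, uint16_t hwq)
{
    if (!ha->mqenable)
        return NULL;
    if (ha->queue_pair_map == NULL || hwq >= ha->max_qpairs)
        return NULL;    /* mqenable set but no map: the state behind the crash */
    return ha->queue_pair_map[hwq];
}
```

With `mqenable` set, `max_qpairs` equal to zero, and a NULL map, `lookup_qpair` returns NULL instead of faulting, which is the combination the buggy kernel reached on dual-core hosts.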

Example Real World Stack Trace

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
Oops: 0000 [#1] SMP PTI
...
RIP: 0010:qla2xxx_queuecommand+0x16b/0x3f0 [qla2xxx]
Call Trace:
 scsi_queue_rq+0x58c/0xa60
 blk_mq_dispatch_rq_list+0x2b7/0x6f0
 ...

The Root Cause

- The new “optimized” code didn’t reserve IRQ vectors for the mailbox (MB), default response queue, and ATIO interrupts.
- On low-CPU systems (dual-core, 2 CPUs), you could end up with zero queue pairs, but the driver *would still enable mq (multi-queue operations)*.
- The driver then indexes a queue pair map that was never allocated, leading to a NULL pointer dereference.
- Even on slightly larger systems, the setup could be “unbalanced”, with fewer hardware queues than CPUs, hurting performance or stability.

How Was It Fixed?

The new logic requests enough vectors for:

- Every CPU (for fast, parallel processing)
- PLUS the reserved interrupts (mailbox/MB, RSP, ATIO)

This keeps the equation valid for all supported CPU counts.

Patch Example (pseudo)

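int cpu_count = num_online_cpus();          /* one vector per online CPU */
int reserved = 3;                           /* MB, RSP, (ATIO if needed) */
int vectors_needed = cpu_count + reserved;
/* allocate_msix_vectors() is a placeholder for the driver's real PCI vector allocation call */
allocate_msix_vectors(vectors_needed);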

Now, ha->queue_pair_map is always allocated if mqenable is active. No more kernel crashes!
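The before-and-after effect can be compared with a short sketch. All function names here are hypothetical, and the cap at the hardware's MSI-X limit (`hw_msix_count`) is an assumption about how the request is bounded rather than a quote from the patch.

```c
#include <stdio.h>

int min_int(int a, int b) { return a < b ? a : b; }

/* Vectors requested before the fix: capped at the CPU count. */
int vectors_before_fix(int hw_msix_count, int cpus)
{
    return min_int(hw_msix_count, cpus);
}

/* Vectors requested after the fix: one per CPU plus the reserved ones. */
int vectors_after_fix(int hw_msix_count, int cpus, int reserved)
{
    return min_int(hw_msix_count, cpus + reserved);
}

/* Queue pairs left once the reserved vectors are taken out. */
int qpairs_from_vectors(int vectors, int reserved)
{
    int qpairs = vectors - reserved;

    return qpairs < 0 ? 0 : qpairs;     /* clamp is an assumption of the sketch */
}
```

For a hypothetical adapter offering 32 vectors on a two-CPU host with two reserved interrupts, the old request yields zero queue pairs, while the new request yields one per CPU. On an eight-CPU host the old request leaves the driver two queue pairs short of the CPU count, which is the unbalanced case.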

Official Patch Commit:
- scsi: qla2xxx: Reserve extra IRQ vectors (commit 5ce49ba3b2e6)
- Original Bug Report / Patch Email

Can It Be Exploited?

Accidentally:
Any admin running an affected kernel, on a dual-core/VM system with QLogic cards, could suddenly see kernel crashes upon issuing SCSI commands.

By a Local Attacker:
A clever malicious user with root could likely trigger the crash reliably with simple SCSI scans/commands, potentially causing a denial-of-service.

But:
There is no evidence that the bug was remotely exploitable or that it could lead to privilege escalation.

If you cannot move to a kernel that includes the fix, avoid affected kernels when running qla2xxx on small SMP (multi-core) systems.

To check your kernel for the bug:

- Is your kernel between v5.10-rc1 and v5.15.11?
- And does /lib/modules/$(uname -r)/kernel/drivers/scsi/qla2xxx/ exist?
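The two checks can also be scripted. The sketch below assumes a Linux host with the usual /lib/modules layout; the function names are hypothetical, and the version comparison is left to the caller because the exact affected range should be taken from your distribution's advisory.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/utsname.h>

/* Parse "5.11.3-generic" into numbers; returns how many fields were read. */
int parse_kernel_release(const char *release, int *major, int *minor, int *patch)
{
    *major = *minor = *patch = 0;
    return sscanf(release, "%d.%d.%d", major, minor, patch);
}

/* Build the qla2xxx module directory path for the running kernel. */
int qla2xxx_module_dir(char *buf, size_t len)
{
    struct utsname u;
    int n;

    if (uname(&u) != 0)
        return -1;
    n = snprintf(buf, len, "/lib/modules/%s/kernel/drivers/scsi/qla2xxx",
                 u.release);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Returns 1 if path exists and is a directory, 0 otherwise. */
int dir_exists(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

A caller would combine `qla2xxx_module_dir` with `dir_exists` and compare the parsed release against the affected range. Note that a missing directory only shows the module is not shipped as a loadable file; a kernel with the driver built in would need a different check.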

References & Further Reading

- Kernel.org changelog: scsi: qla2xxx: Reserve extra IRQ vectors
- CVE Entry: CVE-2021-46964
- Patch submission/discussion on lore.kernel.org

Closing Thoughts

Kernel driver bugs like CVE-2021-46964 can cause surprising system crashes, sometimes long after they are introduced by "optimizing" patches. Thanks to diligent bug reporters and maintainers, this one was fixed before it affected many users. But if you run storage VMs or hosts on QLogic hardware, make sure your kernel is up to date!


Author’s Note:

This explanation is written for clarity and depth by a Linux security specialist, breaking down *exactly* what led to CVE-2021-46964, why it crashed systems, how it was resolved, and how sysadmins and developers can think about similar problems in the future.

Timeline

Published on: 02/27/2024 19:04:07 UTC
Last modified on: 01/08/2025 16:23:53 UTC