The Linux kernel underpins countless operating systems on servers and phones around the world, so security vulnerabilities in the kernel can have far-reaching consequences. One such critical bug has been assigned CVE-2023-3611. Let's break down what this vulnerability is, how it works, and why you should take it seriously.

What is sch_qfq?

Before we dive into the vulnerability, it's good to know a bit about the component involved: sch_qfq.

In Linux, network traffic can be shaped and scheduled using various queue disciplines ("qdiscs"), which control how packets are sent out by the network stack. sch_qfq stands for *Quick Fair Queueing*, a qdisc that aims to share bandwidth fairly and efficiently between multiple traffic flows.

The Root of the Problem: qfq_change_agg()

The bug lies in the function qfq_change_agg() within the kernel’s sch_qfq code (net/sched/sch_qfq.c). It's related to how the kernel manages bandwidth and size limits for queued packets.

Due to improper bounds checking, a local attacker can trigger an out-of-bounds write by manipulating certain parameters. Specifically, the lmax variable — which tracks packet size limits per traffic class — can be written with values based on user-controlled input, and there's no proper verification to ensure these updates stay within safe boundaries.

In simple words:  
If you can create and modify qfq scheduling rules (this normally requires privileges, but not always in containers or on misconfigured systems), you can coax the kernel into writing data where it shouldn't. That can potentially lead to arbitrary code execution in the kernel, the highest-privilege context on the system.

Here’s a simplified, illustrative snippet of the problematic pattern (modelled on kernel 6.3, not the verbatim kernel source):

static void qfq_change_agg(struct Qdisc *sch, unsigned int class, struct sk_buff *skb)
{
    struct qfq_sched *q = qdisc_priv(sch);
    u32 len = skb->len;

    if (q->lmax[class] < len)
        q->lmax[class] = len; // No bounds check on 'class'
}

The key here is that class comes from user-controlled input, and there’s no check to make sure it is less than the size of the lmax array. The result is an out-of-bounds write.
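To make the failure mode concrete, here is a minimal user-space sketch, not kernel code. It assumes a hypothetical table of four per-class limits followed directly by an unrelated value, both modelled inside one array so that the demonstration stays well-defined C. The names `model`, `NUM_CLASSES`, `update_lmax_unchecked`, and `neighbour_clobbered` are illustrative and do not come from the kernel.

```c
#include <stdint.h>
#include <string.h>

#define NUM_CLASSES 4  /* hypothetical table size, for illustration only */

/* One flat buffer: slots 0..NUM_CLASSES-1 play the role of lmax[],
 * and the final slot is a neighbouring value the update must never touch. */
struct model {
    uint32_t mem[NUM_CLASSES + 1];
};

/* Mirrors the unchecked pattern: 'cls' is used as an index as-is.
 * Callers in this demo must keep cls <= NUM_CLASSES so we stay inside mem[]. */
static void update_lmax_unchecked(struct model *m, unsigned int cls, uint32_t len)
{
    if (m->mem[cls] < len)
        m->mem[cls] = len;
}

/* Returns 1 if the neighbouring slot was modified by the update, else 0. */
static int neighbour_clobbered(unsigned int cls, uint32_t len)
{
    struct model m;

    memset(&m, 0, sizeof(m));
    update_lmax_unchecked(&m, cls, len);
    return m.mem[NUM_CLASSES] != 0;
}
```

Calling `neighbour_clobbered(1, 1500)` leaves the neighbour alone, while `neighbour_clobbered(NUM_CLASSES, 1500)` overwrites it. In a real kernel the neighbour would be whatever object happens to sit next to the array.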

Who's at Risk?

- Any user/process with the right to configure traffic control (tc) qdiscs with the sch_qfq discipline.
- Shared/virtualized environments, containers, and any workloads where users can manipulate traffic control rules.

Exploit Path: Local Privilege Escalation

A local attacker (one with access to create/manage qfq qdiscs, which is often possible in containers or by default on some systems) can trigger the out-of-bounds write and then carefully manipulate memory to execute arbitrary code in the kernel context or crash the system.

By combining this write with advanced heap spraying or data structure manipulation, a full local root exploit is plausible.

Proof-of-Concept Code

Here's a conceptual demonstration in C using Netlink sockets to interact with the kernel's qdisc interface — note that the real exploit would be more involved and dangerous.

// Note: This is simplified and not a functional exploit!
#include <linux/pkt_sched.h>
#include <linux/rtnetlink.h>
#include <net/if.h>
#include <sys/socket.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

// You need CAP_NET_ADMIN capability for this!
int main(void) {
    int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (sock < 0) {
        perror("socket");
        return 1;
    }

    struct {
        struct nlmsghdr nlh;
        struct tcmsg tcm;
        char buf[256];
    } req;

    memset(&req, 0, sizeof(req));
    req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
    req.nlh.nlmsg_type = RTM_NEWQDISC;
    req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
    req.tcm.tcm_family = AF_UNSPEC;
    // "lo" is a placeholder: use the name of your network interface
    req.tcm.tcm_ifindex = if_nametoindex("lo");
    // TC_H_MAKE expects the major number already shifted into the upper 16 bits
    req.tcm.tcm_handle = TC_H_MAKE(1 << 16, /* oversized class index */ 0x100);

    // Here, we try to assign a huge class index > lmax[], triggering OOB write
    // Would also need to attach a qfq qdisc and a large packet in a real exploit

    if (send(sock, &req, req.nlh.nlmsg_len, 0) < 0)
        perror("send");

    close(sock);
    return 0;
}

This just shows that class handles are user-controlled and can reach out-of-bounds. Real exploitation is more involved and should not be attempted on production systems or on anyone else's machine!
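Since the handle is the user-controlled value here, it helps to see how it is laid out. A tc handle is a 32-bit number with the major part in the upper 16 bits and the minor part in the lower 16 bits. The helpers below are a small sketch of that layout with hypothetical names (`make_handle`, `handle_major`, `handle_minor`); the kernel headers provide their own macros in `<linux/pkt_sched.h>`, and `TC_H_MAKE` there expects a major value that has already been shifted.

```c
#include <stdint.h>

/* Pack a major and a minor number into a 32-bit tc handle (major:minor). */
static uint32_t make_handle(uint16_t major, uint16_t minor)
{
    return ((uint32_t)major << 16) | minor;
}

/* Extract the upper 16 bits (the qdisc part of the handle). */
static uint16_t handle_major(uint32_t handle)
{
    return (uint16_t)(handle >> 16);
}

/* Extract the lower 16 bits (the class part of the handle). */
static uint16_t handle_minor(uint32_t handle)
{
    return (uint16_t)(handle & 0xFFFFu);
}
```

With these helpers, `make_handle(1, 0x100)` corresponds to the handle written `1:100` in tc notation, where both halves are hexadecimal.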

How Was It Fixed?

The official fix (commit 3e337087c3b5805fe0b8a46ba622a962880b5d64) adds proper bounds checks for the class index before writing to lmax.

In the patch, before writing to lmax[class], the code checks if class < QFQ_MAX_CLASSES and avoids any write if out of range.
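The shape of such a check can be sketched in user-space C. This follows the description above and is not the verbatim kernel patch: the function name `update_lmax_checked` is hypothetical, and the value given to `QFQ_MAX_CLASSES` is a placeholder chosen only so the example compiles.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define QFQ_MAX_CLASSES 8  /* placeholder value, assumed for this sketch */

/* Raise lmax[cls] to len if len is larger, but only for a valid index.
 * Returns 0 on success and -EINVAL when the request is rejected. */
static int update_lmax_checked(uint32_t *lmax, unsigned int cls, uint32_t len)
{
    if (lmax == NULL || cls >= QFQ_MAX_CLASSES)
        return -EINVAL;  /* reject before touching any memory */

    if (lmax[cls] < len)
        lmax[cls] = len;
    return 0;
}
```

Returning an error instead of silently clamping the index lets the caller drop the request, so a bad index never turns into a write to some other valid slot.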

Upgrade Your Kernel ASAP

Patch applied in stable releases: 6.3.13+, 6.4.4+, and in all supported kernel branches as of July 2023.  

Disable sch_qfq if not used

If your systems don’t need qfq, consider blacklisting the module or restricting access to untrusted users.
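Before blacklisting, it can be useful to check whether the module is loaded at all. The sketch below scans `/proc/modules` for a given module name; the helper names `line_names_module` and `module_is_loaded` are made up for this example. It assumes sch_qfq is built as a loadable module: a scheduler compiled directly into the kernel does not appear in `/proc/modules`, and a module that is absent now may still be loaded later on demand.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if a /proc/modules line describes the module 'name'.
 * Each line starts with the module name followed by a space. */
static int line_names_module(const char *line, const char *name)
{
    size_t n = strlen(name);

    return strncmp(line, name, n) == 0 && (line[n] == ' ' || line[n] == '\0');
}

/* Returns 1 if loaded, 0 if not listed, -1 if /proc/modules is unreadable. */
static int module_is_loaded(const char *name)
{
    char line[512];
    int found = 0;
    FILE *f = fopen("/proc/modules", "r");

    if (!f)
        return -1;
    while (!found && fgets(line, sizeof(line), f))
        found = line_names_module(line, name);
    fclose(f);
    return found;
}
```

A call such as `module_is_loaded("sch_qfq")` gives a quick answer for the running system, which can then inform whether a blacklist entry is worth adding.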

Restrict CAP_NET_ADMIN Capabilities

Avoid giving this capability to untrusted containers/users.
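To audit a workload, one option is to have it report whether it actually holds CAP_NET_ADMIN. The sketch below reads the effective capability mask (`CapEff`) from `/proc/self/status` and tests bit 12, which is the number assigned to CAP_NET_ADMIN in `<linux/capability.h>`. The function names are hypothetical, and the result only describes the calling process in its own user namespace.

```c
#include <stdint.h>
#include <stdio.h>

#define CAP_NET_ADMIN_BIT 12  /* CAP_NET_ADMIN's number in <linux/capability.h> */

/* Returns 1 if the capability mask has the CAP_NET_ADMIN bit set, else 0. */
static int mask_has_net_admin(uint64_t cap_mask)
{
    return (int)((cap_mask >> CAP_NET_ADMIN_BIT) & 1u);
}

/* Reads CapEff from /proc/self/status.
 * Returns 1 or 0 for the capability, or -1 if the mask cannot be read. */
static int process_has_net_admin(void)
{
    char line[256];
    unsigned long long mask;
    int result = -1;
    FILE *f = fopen("/proc/self/status", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "CapEff: %llx", &mask) == 1) {
            result = mask_has_net_admin(mask);
            break;
        }
    }
    fclose(f);
    return result;
}
```

Running `process_has_net_admin()` inside a container is a simple way to confirm that a capability drop took effect as intended.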

References

- NVD CVE-2023-3611
- kernel.org commit 3e337087c3b5805fe0b8a46ba622a962880b5d64
- Original Patch Mail
- LWN article on CVE-2023-3611

Conclusion

CVE-2023-3611 is a classic example of how a small oversight in kernel bounds checking can lead to serious security flaws. With the wide use of Linux in cloud, server, and device contexts, it’s vital to promptly patch your systems and review privilege policies regarding low-level network operations.

If you maintain Linux systems, update your kernel now. If you manage networks and containers, don’t let users or workloads have unnecessary CAP_NET_ADMIN or interact with qfq unless absolutely needed.

Security starts with careful code — and quick patching when mistakes happen.

Timeline

Published on: 07/21/2023 21:15:00 UTC
Last modified on: 08/19/2023 18:17:00 UTC