A vulnerability (now fixed) in the Linux kernel's SMC-D code allowed unprivileged users to crash the system by triggering a NULL pointer dereference when dumping SMC-D connections. This post explains how the bug happened, how it could be exploited to cause a denial of service (DoS), and what the patch did to fix it.

What is SMC-D?

SMC-D (Shared Memory Communications - Direct Memory Access) is a high-performance networking feature that lets TCP applications exchange data through shared memory instead of the regular TCP/IP path. It is used where fast, low-latency communication is needed and is increasingly present in mainframe and datacenter Linux installations.

Vulnerability Summary

ID: CVE-2024-26615
Affected Kernel Versions: Linux 6.7.0 and possibly earlier
Patched in: Mainline and stable Linux kernel (see upstream commit)

The Bug

A NULL pointer dereference occurs when SMC-D connections are dumped during the short window in which a connection has been partially set up but not fully initialized. Under load, anyone running smcss -D in a loop could trigger the crash. To reproduce:

1. Start Stress Test Servers and Clients

smc_run nginx
smc_run wrk -t 16 -c 100 -d 10s -H 'Connection: Close' http://localhost/

Here smc_run is the wrapper from smc-tools that launches a program so that its TCP sockets use SMC.

2. In Another Terminal, Dump SMC-D Connections in a Loop

watch -n 1 'smcss -D'

This re-runs smcss -D once per second, so the kernel is asked to dump SMC-D connection state repeatedly while connections are being created and torn down.

After a few iterations, the system may hit:

BUG: kernel NULL pointer dereference, address: 0000000000000030
...
smcss[7204]: segfault at 30 ip 00007f8b9cf55695 sp 00007ffd3a8f194 error 4 in libc.so.6[7f8b9cea400+17e000]

Under The Hood: What Actually Happens?

When SMC-D connections are listed (for example, via smcss -D), the kernel calls __smc_diag_dump(). This function assumes that every connection struct (conn) it reports is fully initialized. However, there is a brief window after a connection has been registered to its link group but before its rmb_desc member is set up. Reading conn->rmb_desc during that window dereferences a NULL pointer and triggers a kernel Oops, or a panic if the system is configured to panic on Oops.

Code snippet before the fix

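// Simplified illustration of net/smc/smc_diag.c (not the verbatim source)
static int __smc_diag_dump(struct sock *sk, ...)
{
    ...
    struct smc_connection *conn = ...;
    struct smc_link_group *lgr = conn->lgr;  // Connection is already registered to a link group
    ...
    // No check that conn->rmb_desc has been set up yet!
    do_something(conn->rmb_desc->some_field);  // NULL dereference inside the race window
    ...
}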

Why does this happen?

The connection registration and buffer initialization are two separate steps. During high load and rapid connection churn, dumping SMC-D connections might catch a connection right in this intermediate, not-fully-initialized state.
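
The ordering problem can be modeled in user space. The sketch below is not kernel code: the model_ structs and functions are hypothetical stand-ins, and it assumes only what the text above states, namely that registration happens before rmb_desc is set. It shows the state a concurrent dump can observe between the two steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_buf_desc {
    unsigned long long token;
};

struct model_conn {
    bool registered;                 /* step 1 done: visible to dumpers */
    struct model_buf_desc *rmb_desc; /* step 2 sets this; NULL until then */
};

/* Step 1: the connection becomes visible in the link group. */
static void model_register(struct model_conn *conn)
{
    assert(conn != NULL);
    conn->registered = true;
    conn->rmb_desc = NULL;
}

/* Step 2: the receive buffer descriptor is attached later. */
static void model_attach_buffer(struct model_conn *conn,
                                struct model_buf_desc *desc)
{
    assert(conn != NULL);
    conn->rmb_desc = desc;
}

/* True when a dumper can see the connection but must not touch rmb_desc. */
static bool model_in_race_window(const struct model_conn *conn)
{
    assert(conn != NULL);
    return conn->registered && conn->rmb_desc == NULL;
}
```

After model_register(), model_in_race_window() reports true until model_attach_buffer() runs. That interval is what a dump request can land in when connections are churning.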

Exploitability

Who can trigger this?
Any local user who can run smcss -D, which probably does not require special privileges. Management tools that run smcss on a user's behalf could also expose the bug remotely.

Impact:
Local denial of service: a kernel Oops, escalating to a panic and system reboot if panic_on_oops is enabled.

Severity:
Medium. The bug allows denial of service but does not provide privilege escalation.
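
Whether an Oops stays an Oops or becomes a full panic depends on the kernel.panic_on_oops sysctl, exposed at /proc/sys/kernel/panic_on_oops. A minimal sketch for checking it follows. The helper names are hypothetical, and the path is passed in as a parameter so the function can be pointed at any file.

```c
#include <assert.h>
#include <stdio.h>

/* Reads a single integer from the file at `path`.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
static int read_sysctl_int(const char *path)
{
    FILE *f;
    int value = -1;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Maps the sysctl value to a short description of the crash behaviour. */
static const char *describe_panic_on_oops(int value)
{
    if (value < 0)
        return "unknown";
    return value ? "oops escalates to a panic"
                 : "kernel tries to continue after an oops";
}
```

A caller would pass "/proc/sys/kernel/panic_on_oops" to read_sysctl_int() and feed the result to describe_panic_on_oops().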

The Patch

The fix makes sure rmb_desc is non-NULL before the dump code uses it. In simplified form:

if (conn->rmb_desc)
    do_something(conn->rmb_desc->some_field);
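
As a user-space sketch of the same guard, the function below refuses to produce output when the descriptor is missing instead of dereferencing it. All sketch_ names and the two token fields are assumptions made for illustration; the real change lives in __smc_diag_dump() and may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
struct sketch_buf_desc {
    unsigned long long token;
};

struct sketch_conn {
    struct sketch_buf_desc *rmb_desc; /* may still be NULL during setup */
    unsigned long long peer_token;
};

struct sketch_dmbinfo {
    unsigned long long token;
    unsigned long long peer_token;
};

/* Fills *out and returns true only when the connection's buffer
 * descriptor exists; otherwise leaves *out untouched and returns false. */
static bool sketch_fill_dmbinfo(const struct sketch_conn *conn,
                                struct sketch_dmbinfo *out)
{
    assert(out != NULL);
    if (conn == NULL || conn->rmb_desc == NULL)
        return false; /* not fully initialized: skip instead of crashing */

    memset(out, 0, sizeof(*out));
    out->token = conn->rmb_desc->token;
    out->peer_token = conn->peer_token;
    return true;
}
```

The design choice here is to skip a half-initialized connection rather than wait for it: a diagnostic dump is a snapshot, so omitting an entry that is still being set up is harmless.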

Full commit (reference):
- net/smc: fix illegal rmb_desc access in SMC-D connection dump

Mitigation and detection:
- Mitigation: If you can't upgrade, prevent untrusted users from running SMC tools.
- Detection: Watch logs for kernel Oopses or panics that contain "BUG: kernel NULL pointer dereference" together with a reference to smc_diag.
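
One way to automate the detection check is a small line-by-line log scanner. The sketch below is an assumption about what a useful signature looks like, based only on the trace in this post: a NULL-dereference BUG line followed within a few lines by a mention of smc_diag. The struct, the function name, and the 40-line window are hypothetical choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define OOPS_WINDOW 40 /* lines to inspect after a BUG line (arbitrary) */

struct oops_scan {
    int lines_left; /* remaining lines in the current window; 0 = idle */
};

/* Feed one log line at a time. Returns true when a line mentioning
 * smc_diag appears within OOPS_WINDOW lines after a NULL-dereference
 * BUG line; the scanner then goes back to idle. */
static bool oops_scan_feed(struct oops_scan *s, const char *line)
{
    assert(s != NULL && line != NULL);

    if (strstr(line, "BUG: kernel NULL pointer dereference") != NULL) {
        s->lines_left = OOPS_WINDOW; /* open (or restart) the window */
        return false;
    }
    if (s->lines_left > 0) {
        s->lines_left--;
        if (strstr(line, "smc_diag") != NULL) {
            s->lines_left = 0;
            return true;
        }
    }
    return false;
}
```

A caller would zero-initialize a struct oops_scan and pass each line of the kernel log to oops_scan_feed(), raising an alert whenever it returns true.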

References

- Linux kernel commit - net/smc: fix illegal rmb_desc access in SMC-D connection dump
- CVE Details entry *(automatically tracked)*
- SMC (Shared Memory Communications) overview
- Syzkaller report *(example)*

Full Stack Trace Example

BUG: kernel NULL pointer dereference, address: 0000000000000030
CPU: 2 PID: 7204 Comm: smcss Kdump: loaded Tainted: G E 6.7.0+ #55
RIP: 0010:__smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag]
Call Trace:
  __smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag]
  smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
  smc_diag_dump+0x26/0x60 [smc_diag]
  ...

Conclusion

CVE-2024-26615 is a classic "race-to-crash" local DoS bug: simple, but dangerous for kernel stability. It is a reminder that even rarely used tools and corner-case networking features deserve scrutiny and robust NULL checks. If you use SMC-D, upgrade to a kernel that includes the fix.



Timeline

Published on: 03/11/2024 18:15:19 UTC
Last modified on: 12/12/2024 15:31:02 UTC