CVE-2024-21626 - How a runc File Descriptor Leak Lets Attackers Escape Containers and Access the Host

If you work with containers on Linux, you probably use or have heard of runc—the lightweight CLI tool that spawns and runs containers. Used everywhere from Docker to Kubernetes, it’s at the heart of how containers actually get created on a Linux box. In early 2024, a serious vulnerability—CVE-2024-21626—was discovered in runc that lets malicious attackers burst out of their container sandbox and poke around on the host. This post breaks down what happened, how the attack works, and what you can do about it.

> 🔗 CVE Record: CVE-2024-21626
> 🔗 runc Security Advisory on GitHub

What is runc?

Just to recap, runc is a command-line tool that acts as the low-level runtime for containers on Linux. When you invoke docker run, after some higher-level shuffling, Docker calls into runc to actually create your isolated environment as per the OCI specification. If runc has a bug, that vulnerability can spill over to any container platform built on it.

The Bug: Leaked File Descriptors

The key issue with CVE-2024-21626 is an "internal file descriptor leak." On Unix-like systems, a process refers to every open file, directory, or socket through a file descriptor (FD). If a process opens a file or directory and doesn’t close it, and then forks (creates a child process), the child inherits those same FDs. They also survive a later exec unless they are marked close-on-exec.
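To make the inheritance rule concrete, here is a minimal sketch in Python. It is not runc code: the helper name child_sees_fd and the temporary directory are purely illustrative, and it assumes a Linux host. It opens a directory, toggles the descriptor's inheritable flag (the inverse of close-on-exec), and asks a child process whether the descriptor is still open.

```python
import os
import subprocess
import sys
import tempfile

# Code run by the child: report whether the given FD number is still open.
CHILD_CODE = """
import os, sys
fd = int(sys.argv[1])
try:
    os.fstat(fd)  # succeeds only if the descriptor survived exec
    print("inherited")
except OSError:
    print("closed")
"""


def child_sees_fd(inheritable: bool) -> str:
    """Open a directory, spawn a child, and report what the child sees."""
    with tempfile.TemporaryDirectory() as tmp:
        fd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY)
        try:
            # inheritable=False is equivalent to setting close-on-exec.
            os.set_inheritable(fd, inheritable)
            result = subprocess.run(
                [sys.executable, "-c", CHILD_CODE, str(fd)],
                close_fds=False,  # let the inheritable flag decide
                capture_output=True,
                text=True,
                check=True,
            )
            return result.stdout.strip()
        finally:
            os.close(fd)
```

Calling child_sees_fd(True) should report "inherited", while child_sees_fd(False) should report "closed". The leak in runc is the first case happening by accident with a descriptor that refers to the host.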

runc failed to close certain internal file descriptors before handing control to the container workload. According to the runc advisory, these included a handle to the host's /sys/fs/cgroup directory. An attacker who controlled the container image, or who controlled a process started inside a container through runc exec, could take advantage of these leftover FDs.

Attack Scenarios

There are three main attacks possible due to this bug.

Attack 1: Container Escape with Malicious Image (runc run)

If a container was created from a malicious image, the startup process could scan its open file descriptors and find one that points to a host directory (like /). It could then fchdir() into that directory and access (or overwrite) host files. This lets the attacker go from container access to host access.

Attack 2: Escape During runc exec

Suppose you already have a running (possibly benign) container. If you run a command with runc exec into it, and the attacker can control that invoked process, the leaked FD in the child process again gives access to the host's filesystem. This allows privilege escalation *after* container creation.

Attack 3: Arbitrary Host File Overwrite

By abusing these descriptors, especially when binaries are writable, an attacker can perform semi-arbitrary host binary overwrites ("attack 3a" and "attack 3b"). This can fundamentally break container isolation and lead to persistent compromise.

Here’s how a basic exploit works, step by step

1. Enumerate FDs: The attacker's code looks through /proc/self/fd for open FDs.
2. Identify Host FD: It checks which FDs point to directories (like /) outside the container.
3. Change Directory: It uses fchdir(fd) to change its working directory to the host's root.
4. Access Host Files: It opens or overwrites sensitive files on the host like /etc/shadow.
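The first two steps are also useful defensively, as a read-only audit of what a process has inherited. The sketch below assumes Linux with /proc mounted, and the function name list_directory_fds is made up for illustration. It only lists descriptors that refer to directories and where they resolve; it does not change directory or open anything.

```python
import os
import stat


def list_directory_fds(pid: str = "self") -> dict:
    """Map each open FD that refers to a directory to its resolved path."""
    found = {}
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        path = os.path.join(fd_dir, name)
        try:
            # os.stat() follows the /proc symlink to the open file itself.
            if stat.S_ISDIR(os.stat(path).st_mode):
                found[int(name)] = os.readlink(path)
        except OSError:
            continue  # descriptor was closed between listdir() and stat()
    return found


if __name__ == "__main__":
    for fd, target in sorted(list_directory_fds().items()):
        print(f"fd {fd} -> {target}")
```

Run inside a container, any directory descriptor that the process did not open itself deserves a closer look.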

PoC: Exploiting Leaked Host Root Descriptor

Here's an easy-to-understand Python snippet that attacks this vector. It should run inside a compromised container:

import os

for fd in range(3, 100):  # Skip stdin, stdout, stderr
    try:
        target = os.readlink(f"/proc/self/fd/{fd}")
        # fchdir() only succeeds if fd refers to an open directory
        os.fchdir(fd)
        # Relative path: this works only if fd points at the host's root
        with open("etc/shadow", "r") as f:
            print(f"fd {fd} -> {target}")
            print(f.read())
        break  # Stop if we succeed
    except OSError:
        pass  # Closed fd, not a directory, or file missing; keep looking

Real-World Impact

- Cloud Workloads: Any multi-tenant platform using runc <= 1.1.11 could have containers escaping onto the host.
- CI/CD Systems: Malicious builds could gain persistent host access, steal secrets, or plant backdoors.
- Kubernetes: If nodes let pods run as root, or with certain privileges, this attack could be weaponized.

How is It Fixed?

The runc 1.1.12 release (a security patch) addresses this by:

- Ensuring all file descriptors not explicitly needed by the child process are closed before executing workloads.
- Hardening the checks around what gets inherited by the child process.
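The idea of not letting stray descriptors reach the workload can be sketched in a few lines of Python. This is an illustration of the general technique only, not a translation of the runc patch (runc is written in Go), and the function name mark_non_stdio_cloexec is hypothetical. It assumes Linux with /proc mounted and flags every descriptor above stderr as close-on-exec, so the kernel drops it when the next program is executed.

```python
import os


def mark_non_stdio_cloexec(min_fd: int = 3) -> list:
    """Flag every open FD >= min_fd as close-on-exec; return those flagged."""
    marked = []
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        if fd < min_fd:
            continue  # keep stdin, stdout, and stderr inheritable
        try:
            os.set_inheritable(fd, False)  # sets the close-on-exec flag
            marked.append(fd)
        except OSError:
            continue  # descriptor was already closed
    return sorted(marked)
```

A launcher would call this immediately before exec, after it has finished using its own internal descriptors.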

Patch diff:
- GitHub diff: fix FD leak

To check your runc version

runc --version
# Should say 1.1.12 or higher
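If you need to run this check across many hosts, the comparison can be scripted. The sketch below assumes the first line of the output looks like "runc version 1.1.12"; that format and the helper names are assumptions, so adjust the pattern if your build prints something different. Also keep in mind that some distributions backport security fixes without bumping the upstream version number, so treat the result as a hint rather than proof.

```python
import re
import subprocess

FIXED_VERSION = (1, 1, 12)  # first release with the fix, per this post


def parse_runc_version(output: str):
    """Extract (major, minor, patch) from `runc --version` output, or None."""
    match = re.search(r"runc version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_patched(output: str) -> bool:
    """True if the reported version is at least FIXED_VERSION."""
    version = parse_runc_version(output)
    return version is not None and version >= FIXED_VERSION


def check_installed_runc() -> bool:
    """Run the local runc binary and evaluate what it reports."""
    result = subprocess.run(
        ["runc", "--version"], capture_output=True, text=True, check=True
    )
    return is_patched(result.stdout)
```

Tuples are compared element by element, so 1.1.9 correctly sorts below 1.1.12, which a plain string comparison would get wrong.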

More Info

- runc Security Guide
- OCI runc Changelog
- runc bug bounty writeup

Conclusion

CVE-2024-21626 is a harsh reminder that even foundational container tooling like runc can have severe bugs. Leaked file descriptors opened the door for attackers to move from "just another container process" to full host access. The fix is out—patch now—and always be wary of what code and images you run on your infrastructure.

Timeline

Published on: 01/31/2024 22:15:53 UTC
Last modified on: 02/11/2024 06:15:11 UTC