CVE-2024-23496 - Unpacking a Heap Buffer Overflow in GGUF’s `gguf_fread_str` Functionality (llama.cpp commit 18c2e17)

In early 2024, a security vulnerability (CVE-2024-23496) was found in the llama.cpp project, specifically in its GGUF library’s gguf_fread_str function. This vulnerability is a heap-based buffer overflow—an issue that can allow an attacker to execute arbitrary code if a specially crafted .gguf file is opened. Let’s walk through what this means, how the bug works, sample code, and what you can do to protect your system.

- llama.cpp is an open-source project for running Llama AI models on CPUs.

- GGUF is a file format/library used in llama.cpp to handle model files.

If you use llama.cpp to load .gguf files, your application could be at risk if a malicious file is opened.

- Function: gguf_fread_str

- Commit: 18c2e17

What’s the Problem?

The function gguf_fread_str is supposed to read a string from a file (a .gguf model file) into a memory buffer. The issue is that it does not validate the string length stored in the file before allocating memory or copying the data. If the file declares a very large length, the allocation can fail or come out far too small, and the subsequent read then writes past the end of the heap buffer.

Below is a simplified, commented version based on the vulnerable commit:

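// Example based on llama.cpp commit 18c2e17 (simplified, intentionally vulnerable)
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

char* gguf_fread_str(FILE* file) {
    uint64_t length;
    fread(&length, sizeof(length), 1, file); // Reads attacker-controlled length; return value unchecked

    char* str = (char*)malloc(length + 1);   // length + 1 wraps to 0 when length == UINT64_MAX; result unchecked
    fread(str, length, 1, file);             // Copies up to length bytes from the file into the buffer
    str[length] = '\0';                      // Null terminator, written at an unvalidated offset

    // ... use str ...
    return str;
}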

What’s wrong here?

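- No check that length is plausible (for example, no larger than the bytes remaining in the file, and far below SIZE_MAX).

- malloc(length + 1) can return NULL, which is never checked, and length + 1 wraps around to 0 when length is UINT64_MAX, so the allocation can be far smaller than length.

- fread(str, length, 1, file) can then write outside the buffer, because it copies up to length bytes regardless of how much was actually allocated.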
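The wraparound case is easy to check with a little arithmetic. The sketch below models C's unsigned 64-bit addition in Python; the names `alloc_size_u64` and `U64_MASK` are made up for illustration, and it assumes a platform where `size_t` is 64 bits wide.

```python
# Python integers never overflow, so the mask stands in for C's uint64_t wraparound.
U64_MASK = (1 << 64) - 1


def alloc_size_u64(length: int) -> int:
    """Return `length + 1` as a 64-bit unsigned C integer would hold it."""
    return (length + 1) & U64_MASK


if __name__ == "__main__":
    for length in (16, 2**32, U64_MASK):
        print(f"length={length:#x} -> malloc argument={alloc_size_u64(length):#x}")
```

Only the largest value wraps: a declared length of 2^64 - 1 turns the malloc argument into 0, while the later read still uses the full, unwrapped length.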

Craft a Malicious File:

The attacker creates a .gguf file where the string length (the 8 bytes that gguf_fread_str reads first) is set to a very large number, without actually including that much data in the file.

Trigger the Vulnerability:

When llama.cpp reads the file, malloc returns a buffer that is too small (or possibly NULL), and fread is then asked to copy far more data than that buffer can hold.

Code Execution:

If the attacker controls data past the buffer, it may be possible to execute arbitrary attacker-controlled code (for example, by corrupting a data structure used later).

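Below is a simple (and educational) way to craft such a file in Python:

# create_bad_gguf.py
# Targets the simplified reader above; a real GGUF file also carries a header
# before its first string.
with open("bad.gguf", "wb") as f:
    f.write((2**64 - 1).to_bytes(8, "little"))  # Length = UINT64_MAX, so length + 1 wraps to 0
    # No data anywhere near that size, just a few bytes of junk.
    f.write(b"A" * 16)

Read by the simplified function above, this file produces a zero-byte allocation request followed by a 16-byte write that lands outside the buffer. A vulnerable llama.cpp build can hit the same heap overflow when it parses a string with such a length.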
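To see why a length prefix like this is dangerous without touching real memory, the reader's bookkeeping can be modelled in Python. This is only a sketch: `simulate_vulnerable_read` is a hypothetical helper, it assumes a 64-bit `size_t`, and it assumes fread copies whatever bytes are available before reaching end of file, as common C libraries do.

```python
import io
import struct
from typing import Tuple

U64_MASK = (1 << 64) - 1


def simulate_vulnerable_read(blob: bytes) -> Tuple[int, int]:
    """Return (bytes_requested_from_malloc, bytes_written_by_fread).

    Models malloc(length + 1) followed by fread(str, length, 1, file).
    """
    stream = io.BytesIO(blob)
    (length,) = struct.unpack("<Q", stream.read(8))  # little-endian uint64 prefix
    allocated = (length + 1) & U64_MASK              # malloc argument after wraparound
    remaining = len(blob) - stream.tell()            # payload bytes actually present
    written = min(length, remaining)                 # fread stops at end of file
    return allocated, written


if __name__ == "__main__":
    crafted = struct.pack("<Q", U64_MASK) + b"A" * 16
    allocated, written = simulate_vulnerable_read(crafted)
    print(f"allocated={allocated}, written={written}, overflow={written > allocated}")
```

For a length prefix of 2^64 - 1 followed by 16 bytes, the model reports a zero-byte request and 16 bytes written. A well-formed input such as a prefix of 4 followed by "abcd" reports 5 bytes requested and 4 written, which fits.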

Original References and Further Reading

- GitHub Commit 18c2e17 (vulnerable code)
- HackerOne report (if available)
- Common Weakness Enumeration: CWE-122 (Heap-based Buffer Overflow)
- Official CVE Record (CVE-2024-23496)

Update llama.cpp:

Build from a current checkout of the llama.cpp repository on GitHub rather than commit 18c2e17, and check the CVE record to confirm that the version you run includes the fix.

A safe implementation would add proper checks on length before allocation and reading:

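#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_STR_LEN 0x10000 // 64 KiB, as an example

// Hardened sketch for illustration; not the upstream patch.
char* gguf_fread_str(FILE* file) {
    uint64_t length;
    if (fread(&length, sizeof(length), 1, file) != 1) return NULL; // Truncated file
    if (length > MAX_STR_LEN) {
        // Too long! Bail out before allocating.
        return NULL;
    }

    char* str = malloc((size_t)length + 1);
    if (!str) return NULL;
    if (length > 0 && fread(str, (size_t)length, 1, file) != 1) {
        free(str); // File holds fewer bytes than the length claimed
        return NULL;
    }
    str[length] = '\0';  // Null terminate
    return str;
}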
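The same checks can be written in Python, which is handy for pre-screening input or for writing regression tests against a parser. The sketch below is an assumption-laden illustration, not part of llama.cpp: `read_length_prefixed_str` is a hypothetical helper, the 64 KiB cap mirrors the example value above, and it handles only a bare length-prefixed string rather than a full GGUF file.

```python
import io
import struct

MAX_STR_LEN = 0x10000  # illustrative 64 KiB cap, not a value taken from llama.cpp


def read_length_prefixed_str(stream, max_len: int = MAX_STR_LEN) -> bytes:
    """Read a little-endian uint64 length, then exactly that many bytes.

    Raises ValueError rather than trusting a length the stream cannot back up.
    """
    prefix = stream.read(8)
    if len(prefix) != 8:
        raise ValueError("truncated length prefix")
    (length,) = struct.unpack("<Q", prefix)
    if length > max_len:
        raise ValueError(f"declared length {length} exceeds cap {max_len}")
    payload = stream.read(length)
    if len(payload) != length:
        raise ValueError("data ends before the declared string length")
    return payload


if __name__ == "__main__":
    good = io.BytesIO(struct.pack("<Q", 5) + b"hello")
    print(read_length_prefixed_str(good))
```

The cap is checked before any data is read, so an absurd length is rejected without allocating anything. The short-read check catches files that claim more data than they contain.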

Conclusion

Heap-based buffer overflows are a classic and dangerous class of bug. In the case of CVE-2024-23496, a single malicious file could let an attacker run code on your machine. Be careful with the model files you open, and keep your libraries updated. If you develop C code that processes untrusted files, never trust what the file says: always validate lengths and buffer sizes.

Want more details, or to see whether a patch exists? Check the llama.cpp repo and the CVE record.

Timeline

Published on: 02/26/2024 16:27:56 UTC
Last modified on: 02/26/2024 18:15:07 UTC