CVE-2025-46646 - Ghostscript's Incomplete Patch Creates Overlong UTF-8 Decoding Risk

Artifex Ghostscript is a popular open source tool for processing PDFs, PostScript files, and other document formats. In April 2025, a new vulnerability (CVE-2025-46646) was published that concerns how Ghostscript handles UTF-8 encoded data. The issue stems from a patch that was supposed to fix an earlier vulnerability (CVE-2024-46954) but left the door open to a similar attack.

This post walks through the flaw with simple code examples, exploit scenarios, and the real risk if you run a Ghostscript version before 10.05.0.

What Is the Problem?

Ghostscript relies on the decode_utf8 function in base/gp_utf8.c to process UTF-8 encoded text safely. However, this function does not reject overlong UTF-8 encodings, that is, sequences that encode a character with more bytes than required. Attackers can use this weakness to smuggle unexpected byte sequences past input filters, or to trigger logic errors or denial of service.

This problem is particularly serious because the previous patch (for CVE-2024-46954) did not fully fix the underlying decoding logic. As a result, the same class of bugs remains exploitable.

Normal encoding example

The ASCII character / (slash) is U+002F, which, in UTF-8, should be a single byte: 0x2F.

Overlong encoding

One overlong way to encode / is using two bytes: 0xC0 0xAF. Both decode to the same character under a lenient decoder, but the two-byte version is invalid UTF-8.

Applications should reject overlong forms, but Ghostscript's broken logic let them through.
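
To make the idea concrete, here is a minimal Python sketch that builds overlong forms of the slash and checks them against Python's built-in strict UTF-8 decoder. The helper names overlong_encode and strictly_decodes are made up for this illustration and have nothing to do with Ghostscript's code.

```python
def overlong_encode(code_point: int, length: int) -> bytes:
    """Encode code_point in exactly `length` UTF-8-style bytes (2 to 4),
    even when a shorter encoding exists (i.e. even when it is overlong)."""
    if length not in (2, 3, 4):
        raise ValueError("length must be 2, 3, or 4")
    lead_prefix = {2: 0xC0, 3: 0xE0, 4: 0xF0}[length]
    payload_bits = {2: 11, 3: 16, 4: 21}[length]
    if code_point < 0 or code_point >= (1 << payload_bits):
        raise ValueError("code point does not fit in this length")
    out = []
    # Emit continuation bytes (10xxxxxx) from the least significant bits up.
    for _ in range(length - 1):
        out.append(0x80 | (code_point & 0x3F))
        code_point >>= 6
    # Whatever bits remain go into the lead byte.
    out.append(lead_prefix | code_point)
    return bytes(reversed(out))


def strictly_decodes(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(strictly_decodes(b"/"))  # True: the one-byte form is valid
for n in (2, 3, 4):
    seq = overlong_encode(0x2F, n)
    # Prints: c0 af False / e0 80 af False / f0 80 80 af False
    print(seq.hex(" "), strictly_decodes(seq))
```

A strict decoder rejects all three overlong forms. That is the behavior a correct decode_utf8 should have as well.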

Here's a simplified, illustrative look at the relevant logic in base/gp_utf8.c (not the verbatim source):

int decode_utf8(const char *in, unsigned int *out)
{
    // ... code omitted for clarity
    if ((in[0] & 0xE0) == 0xC0) {
        // Decoding a two-byte UTF-8 sequence: 110xxxxx 10xxxxxx
        // No check that the result is >= 0x80, so overlong forms pass
        *out = ((in[0] & 0x1F) << 6) | (in[1] & 0x3F);
        return 2;
    }
    // ... Handle 1, 3, 4 byte sequences...
}

What's missing here?
There's no strict check to ensure that the two-byte sequence is the minimal form (not overlong). An attacker could craft malformed input like 0xC0 0xAF to sneak a / past filters.
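
For comparison, a strict two-byte decoder adds a range check on the decoded value. The following Python sketch shows the idea. It is not the actual Ghostscript fix, and decode_two_byte_strict is a hypothetical name.

```python
def decode_two_byte_strict(data: bytes) -> str:
    """Decode one two-byte UTF-8 sequence, rejecting malformed and overlong input."""
    if len(data) < 2:
        raise ValueError("truncated sequence")
    first, second = data[0], data[1]
    if (first & 0xE0) != 0xC0:
        raise ValueError("not a two-byte lead byte")
    if (second & 0xC0) != 0x80:
        raise ValueError("invalid continuation byte")
    code_point = ((first & 0x1F) << 6) | (second & 0x3F)
    # Two bytes are only needed for U+0080..U+07FF; anything lower
    # fits in one byte, so a two-byte form of it is overlong.
    if code_point < 0x80:
        raise ValueError("overlong encoding")
    return chr(code_point)


print(decode_two_byte_strict(b"\xc3\xa9"))  # valid two-byte form of U+00E9
try:
    decode_two_byte_strict(b"\xc0\xaf")
except ValueError as exc:
    print("rejected:", exc)  # rejected: overlong encoding
```

The same check can be written as "reject lead bytes 0xC0 and 0xC1", since those are the only two-byte lead bytes that can produce a value below 0x80.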

Exploit Details

Imagine Ghostscript is used to sanitize or preview PDFs. An attacker submits a file containing 0xC0 0xAF instead of /. If the surrounding code assumes the decoder enforces strict UTF-8, the attacker can:

- Bypass input filters: Filters blocking / or other dangerous characters can be circumvented with overlong encodings.
- Trigger logic flaws: Stateful logic might misinterpret path or command delimiters, because two encodings of the "same" character slip through.

A proof of concept could look like this:

% PostScript file snippet using an overlong-encoded slash (bytes 0xC0 0xAF)
% Malicious input aiming to bypass filters
% (Assumes the string is interpreted as UTF-8, not raw bytes)
% \300\257 are the PostScript octal escapes for the bytes 0xC0 0xAF
(echo\300\257etc\300\257passwd) =

If Ghostscript decodes this string using the vulnerable function, the result contains /etc/passwd.

Let's write a small simulation in Python to illustrate:

def decode_overlong_utf8(data):
    # Simple two-byte decoding with no minimal-form check
    first, second = data[0], data[1]
    char = ((first & 0x1F) << 6) | (second & 0x3F)
    return chr(char)

print(decode_overlong_utf8(b'\xc0\xaf'))  # outputs "/"

If real input sanitization only checks for /, but not for its overlong encoding, the filter can be bypassed.
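
The bypass can be sketched end to end. In the Python example below, naive_filter_allows stands in for a hypothetical byte-level filter and lenient_decode for a decoder without the minimal-form check. Both are assumptions made for illustration, not Ghostscript code.

```python
def lenient_decode(data: bytes) -> str:
    """Decode 1- and 2-byte sequences with no minimal-form check (deliberately unsafe)."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte
            out.append(chr(b))
            i += 1
        elif (b & 0xE0) == 0xC0 and i + 1 < len(data):
            # Two-byte sequence, accepted even when it is overlong
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("sequence not handled by this sketch")
    return "".join(out)


def naive_filter_allows(data: bytes) -> bool:
    """Hypothetical filter that only looks for the literal byte 0x2F."""
    return b"/" not in data


payload = b"\xc0\xafetc\xc0\xafpasswd"
print(naive_filter_allows(payload))  # True: no raw 0x2F byte is present
print(lenient_decode(payload))       # /etc/passwd
```

The filter sees no slash, yet the decoded string contains two. Running the filter on the decoded text, or decoding with a strict decoder first, closes this particular gap.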

Real Consequences

- Anything relying on filename/path filtering is at risk.
- Malicious PDF/PS files can trigger logic bugs or denial of service.

Patched: Ghostscript 10.05.0

- Patch Example: https://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=xxxxxx (substitute with exact commit when published)

Upgrade as soon as possible!

> NOTE: This issue is being tracked as CVE-2025-46646. The flaw exists due to an incomplete fix for CVE-2024-46954.

References

- CVE-2025-46646
- Ghostscript news
- Wikipedia: Overlong UTF-8 encoding

Conclusion

If you use Ghostscript (either directly or indirectly), upgrading to version 10.05.0 or later is vital. Overlong UTF-8 encoding bugs like CVE-2025-46646 might seem subtle, but they can have nasty real-world consequences. Double-check any filters or input validators you have, and don't trust that an old patch covered every case.
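
As a starting point for auditing your own validators, the following Python sketch flags byte offsets where an overlong sequence begins. It relies on the standard UTF-8 byte ranges: lead bytes 0xC0 and 0xC1 are never valid, 0xE0 must be followed by 0xA0 or higher, and 0xF0 must be followed by 0x90 or higher. The function name find_overlong_offsets is hypothetical.

```python
def find_overlong_offsets(data: bytes) -> list:
    """Return the offsets of lead bytes that start an overlong UTF-8 sequence."""
    offsets = []
    for i, b in enumerate(data):
        nxt = data[i + 1] if i + 1 < len(data) else None
        if b in (0xC0, 0xC1):
            # Two-byte forms of U+0000..U+007F
            offsets.append(i)
        elif b == 0xE0 and nxt is not None and 0x80 <= nxt < 0xA0:
            # Three-byte forms of code points below U+0800
            offsets.append(i)
        elif b == 0xF0 and nxt is not None and 0x80 <= nxt < 0x90:
            # Four-byte forms of code points below U+10000
            offsets.append(i)
    return offsets


print(find_overlong_offsets(b"\xc0\xafetc\xc0\xafpasswd"))  # [0, 5]
print(find_overlong_offsets("caf\u00e9/menu".encode("utf-8")))  # []
```

This only detects overlong forms. It does not validate UTF-8 in general, so use it alongside a strict decoder rather than in place of one.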

Timeline

Published on: 04/26/2025 15:15:45 UTC
Last modified on: 04/29/2025 13:52:10 UTC