CVE-2023-3823 - The Hidden Risks in PHP’s XML Functions – How Leaky Global State Led to File Disclosure

In mid-2023, security researchers uncovered a subtle yet severe vulnerability in PHP, affecting versions 8.0 (before 8.0.30), 8.1 (before 8.1.22), and 8.2 (before 8.2.8). Identified as CVE-2023-3823, this bug stems from how PHP handles XML parsing: specifically, its reliance on process-wide global state in the underlying XML library, libxml. When PHP runs in the same process as other software that also uses libxml (like ImageMagick), attackers can exploit this flaw to read sensitive files from the server. This article will help you understand the issue, see an example of how it can be exploited, and learn how to protect yourself.

What is Global State and Why Does it Matter Here?

Libxml is a popular C library that handles XML parsing. To change how XML gets read, libxml uses settings stored in global variables. Every piece of code that uses the library in the same process shares those variables.

PHP’s XML functions (like simplexml_load_string() and DOMDocument::loadXML()) depend on libxml’s global state to decide whether or not to load “external entities.” These are special pieces of XML that can tell the program to read from a file, a URL, or other resources. By default, PHP turns these off for security.

Here’s where things get dangerous. If another library running in the same process (for example, ImageMagick using libxml when it processes SVG files) turns on “external entities” and doesn’t turn it off, PHP will happily load external entities too on the next XML operation. That can lead to exposure of local files, like password files, environment variables, or other private data.

A typical attack sequence looks like this:

1. The PHP process starts, and the global libxml state has external entity loading *disabled* (safe).
2. Your web app calls a PHP function that triggers ImageMagick to process an SVG. Internally, ImageMagick enables external entities in libxml to read files referenced in that SVG.
3. ImageMagick does not restore the original (safe) setting, or restores it incorrectly.
4. Later, in the same process, a user uploads XML data for parsing (e.g., via simplexml_load_string()), and PHP now reads the XML with external entity loading enabled, which is not what the developer expected.
5. An attacker can now provide malicious XML referencing sensitive files. These are read and returned or used by the PHP code.

This *vulnerable state can persist for several requests* until the PHP process is restarted.
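
The sequence above can be imitated with a toy model in plain PHP. This is only a sketch: FakeLibxml, FakeImageLibrary, and FakeXmlParser are hypothetical stand-ins invented for illustration, and none of this code touches the real libxml or ImageMagick. It simply shows how one caller flipping a shared flag changes what a later, unrelated caller sees.

```php
<?php
// Toy model only: hypothetical classes, no real libxml or ImageMagick calls.

final class FakeLibxml
{
    // Stands in for a C global: one flag shared by every caller in the process.
    public static bool $loadExternalEntities = false;
}

final class FakeImageLibrary
{
    public static function renderSvg(): void
    {
        // Enables the flag for its own purposes and never restores it.
        FakeLibxml::$loadExternalEntities = true;
    }
}

final class FakeXmlParser
{
    public static function parse(string $xml): string
    {
        // Trusts whatever the shared flag currently says.
        return FakeLibxml::$loadExternalEntities
            ? 'parsed with external entities ENABLED'
            : 'parsed with external entities disabled';
    }
}

$before = FakeXmlParser::parse('<data/>');   // flag still at its safe default
FakeImageLibrary::renderSvg();               // unrelated code flips the flag
$after = FakeXmlParser::parse('<data/>');    // same call, different behavior

echo $before, PHP_EOL, $after, PHP_EOL;
```

Nothing in the parser's own code changed between the two calls. Only the shared state did, which is why this kind of bug is hard to spot in a code review.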

Let’s see what this could look like in simple PHP code:

// Example: vulnerable code fragment in PHP
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $xml = $_POST['xml'] ?? '';
    // No parser flags are passed, so the result depends on libxml's current global state
    $data = simplexml_load_string($xml);
    if ($data !== false) {
        // The application does something with $data
        echo htmlspecialchars((string) $data->secret, ENT_QUOTES, 'UTF-8');
    }
}

Under normal conditions, simplexml_load_string() blocks XML External Entities (XXE). But if libxml’s global state is tampered with before this is executed, the following “evil” XML, sent by an attacker, could expose the content of /etc/passwd:

<?xml version="1.0"?>
<!DOCTYPE data [
  <!ELEMENT data ANY >
  <!ENTITY secret SYSTEM "file:///etc/passwd" >
]>
<data>
  <secret>&secret;</secret>
</data>

If XXE is enabled, PHP will include the content of /etc/passwd in the output under <secret>. The attacker can thus read any file PHP can access, including database configs, API keys, or logs.
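
Because entities like the one above can only be declared inside a DOCTYPE, one common defense-in-depth measure is to refuse any untrusted document that carries one. The sketch below is not taken from the PHP advisory. parseUntrustedXml() is a hypothetical helper name, and the simple string check assumes the input is UTF-8 or another ASCII-compatible encoding (a UTF-16 document would need to be normalized first).

```php
<?php
// Hypothetical helper: reject DOCTYPE declarations before handing XML to the parser.
function parseUntrustedXml(string $xml): SimpleXMLElement
{
    // Assumption: input is ASCII-compatible (e.g. UTF-8); other encodings
    // would have to be converted before this check is meaningful.
    if (stripos($xml, '<!DOCTYPE') !== false) {
        throw new InvalidArgumentException('DOCTYPE declarations are not accepted');
    }

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // LIBXML_NONET forbids network access during parsing.
    $element = simplexml_load_string($xml, SimpleXMLElement::class, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($element === false) {
        throw new InvalidArgumentException('Malformed XML');
    }
    return $element;
}
```

With this in place, the malicious payload shown earlier is rejected before libxml ever sees it, regardless of what state another library left behind. The trade-off is that legitimate documents with a DOCTYPE are rejected too.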

Why Is This Problem So Tricky?

- It’s Intermittent: You might only be vulnerable sometimes, depending on what other extensions or software are running in your PHP environment.
- Developers Can’t Easily Control It: The vulnerability relies on code *outside* the developer’s control changing the global state in ways that linger.
- It Survives Across Requests: In setups where worker processes are reused for many requests (like PHP-FPM), global state changed by one action can linger and endanger later, unrelated requests.

Responsible Disclosure and Fixes

The original advisory from PHP provides details about the timeline and the patch. The fix ensures that PHP’s XML functions explicitly set libxml’s security settings before every parse call, so external changes don’t “leak” into PHP’s context.

Vulnerable versions

- PHP 8.0.* before 8.0.30 (changelog)
- PHP 8.1.* before 8.1.22 (changelog)
- PHP 8.2.* before 8.2.8 (changelog)
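
The version ranges listed above can be checked programmatically, for example in a deployment script. The following is a minimal sketch: isPatchedForCve20233823() is a hypothetical helper, and it only knows about the three branches named in this article, returning null for anything else.

```php
<?php
// Hypothetical helper: compares a PHP version string against the fixed
// releases listed in this article. Returns null for branches not covered here.
function isPatchedForCve20233823(string $version): ?bool
{
    $fixedReleases = [
        '8.0' => '8.0.30',
        '8.1' => '8.1.22',
        '8.2' => '8.2.8',
    ];

    // Extract the "major.minor" branch, e.g. "8.1" from "8.1.21".
    if (!preg_match('/^(\d+\.\d+)\./', $version, $matches)) {
        return null;
    }
    $branch = $matches[1];
    if (!isset($fixedReleases[$branch])) {
        return null;
    }

    return version_compare($version, $fixedReleases[$branch], '>=');
}

// Check the interpreter running this script.
var_dump(isPatchedForCve20233823(PHP_VERSION));
```

A null result means the branch is outside the ranges discussed here. It should not be read as "safe" without consulting the advisory.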

1. Upgrade PHP to at least 8.0.30, 8.1.22, or 8.2.8 as appropriate for your system.

2. Review all third-party extensions—especially those handling images (like ImageMagick or GD) or documents.

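3. Wherever possible, restrict what the XML parser may load by passing explicit flags. Do not pass LIBXML_NOENT or LIBXML_DTDLOAD for untrusted input: they turn on entity substitution and external DTD loading, which is exactly what an XXE attack needs.

$xml = new DOMDocument();
// LIBXML_NONET forbids network access while parsing; LIBXML_NOENT and LIBXML_DTDLOAD are deliberately left out
$xml->loadXML($userXml, LIBXML_NONET);

- Or, on PHP versions before 8.0, disable the entity loader

libxml_disable_entity_loader(true); // Deprecated as of PHP 8.0, but worth knowing for older versions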
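
On PHP 8.0 and later, where libxml_disable_entity_loader() is deprecated, a custom resolver registered with libxml_set_external_entity_loader() can refuse every external entity request. The sketch below is offered as a defense-in-depth idea under that assumption, not as the fix described in the advisory. loadUntrustedDom() is a hypothetical helper name.

```php
<?php
// Refuse every external entity or DTD fetch that PHP's libxml integration
// routes through the resolver. Returning null signals "could not be resolved".
libxml_set_external_entity_loader(
    static function ($publicId, $systemId, $context) {
        return null;
    }
);

// Hypothetical helper: parse untrusted XML into a DOMDocument, or return null.
function loadUntrustedDom(string $userXml): ?DOMDocument
{
    $doc = new DOMDocument();

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadXML($userXml, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $loaded ? $doc : null;
}
```

Because the resolver applies to every parse PHP performs in the process, it may also block code that legitimately loads external DTDs or files, so it is worth testing against the rest of the application before relying on it.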

Conclusion

CVE-2023-3823 is a real-world example of how even mature, widely used software like PHP can be tripped up by underlying C libraries and global state. It’s a reminder to always keep dependencies updated, remain skeptical of implicit security boundaries, and test code in real, production-like environments.

References

- Debian Security Tracker: CVE-2023-3823
- PHP Security Advisory
- XSS/XXE Attack Examples
- PHP ChangeLogs

Stay safe. Keep your PHP and extensions patched, and always sanitize your inputs!

Timeline

Published on: 08/11/2023 06:15:00 UTC
Last modified on: 08/22/2023 20:07:00 UTC