Libexpat is one of the most widely used XML parsing libraries, baked into software and systems across the internet. When a vulnerability is found in such a core component, the impact can be serious—even catastrophic if left unpatched. In this exclusive, we break down CVE-2022-22826, an integer overflow vulnerability found in the nextScaffoldPart function of xmlparse.c in Expat (before version 2.4.3). We’ll show code snippets, discuss possible exploitation, and point you toward official references.

What is CVE-2022-22826?

CVE-2022-22826 is an integer overflow vulnerability in the Expat (libexpat) XML parser. The bug lives in the nextScaffoldPart function within the file xmlparse.c. Before version 2.4.3, a size calculation in this function could overflow, so the parser could end up with a smaller array than it later writes into. That can lead to heap corruption, a crash, or possibly code execution.

Why It Matters

Expat is embedded in desktop apps, web services, embedded devices, and cloud infrastructure. Any service that parses malicious XML could be a target if it's using a vulnerable version.

What is Integer Overflow?

In simple terms, integer overflow happens when a calculation goes beyond the maximum (or minimum) number a variable can hold. For example, if an unsigned 32-bit integer goes above 0xFFFFFFFF, it wraps around to zero.

If an attacker can control input values that get multiplied, added, or otherwise processed, this can sometimes be used to allocate way too little or too much memory, cause memory corruption, or trigger dangerous behavior.
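The wraparound described above can be sketched in a few lines of C. This is a minimal illustration, not code from Expat; the helper names wrapping_add and add_would_overflow are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned arithmetic in C wraps modulo 2^32 for a 32-bit type:
 * the addition never fails, it silently drops the carry. */
uint32_t wrapping_add(uint32_t a, uint32_t b) {
    return a + b;
}

/* Reports whether a + b would exceed UINT32_MAX. The test is rearranged
 * so that it never performs the overflowing addition itself. */
bool add_would_overflow(uint32_t a, uint32_t b) {
    return a > UINT32_MAX - b;
}
```

Calling wrapping_add(0xFFFFFFFF, 1) yields 0, while add_would_overflow(0xFFFFFFFF, 1) reports the problem before it happens. Checking first, then computing, is the general pattern for guarding size calculations.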

Location: The Flawed Function

The problematic code was in xmlparse.c, inside the function nextScaffoldPart. Here is a simplified sketch of the vulnerable (pre-patch) logic. The identifiers are illustrative rather than copied verbatim from the Expat source:

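int nextScaffoldPart(XML_Parser parser) {
    /* Simplified: `scaffold` stands for the parser's content-model
       bookkeeping; how it is reached from `parser` is omitted here. */
    if (scaffold->partSize == scaffold->partAlloc) {
        /* Double the capacity, with no check that the result still fits */
        size_t new_size = scaffold->partAlloc * 2;
        ScaffoldPart *newParts;

        /* Vulnerability: for a large partAlloc, new_size or
           new_size * sizeof(ScaffoldPart) wraps around, so REALLOC
           is asked for far fewer bytes than the array needs. */
        newParts = (ScaffoldPart *)REALLOC(scaffold->parts,
                                           new_size * sizeof(ScaffoldPart));
        if (!newParts)
            return -1;

        scaffold->parts = newParts;
        scaffold->partAlloc = new_size;
    }
    /* ... */
    return 0;
}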
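
For contrast, here is a minimal sketch of a guarded version of the same growth step. It follows the general check-before-multiply approach and is not the actual 2.4.3 patch. The ScaffoldPart and Scaffold types, the grow_scaffold name, and the initial capacity of 32 are all assumptions made so the example compiles on its own.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the simplified names used above;
 * these are not the real Expat types. */
typedef struct {
    int type;
    int firstChild;
    int lastChild;
    int sibling;
} ScaffoldPart;

typedef struct {
    ScaffoldPart *parts;
    size_t partSize;  /* entries in use */
    size_t partAlloc; /* entries allocated */
} Scaffold;

/* Doubles the array capacity, refusing when the byte count
 * partAlloc * 2 * sizeof(ScaffoldPart) would not fit in size_t.
 * Returns 0 on success, -1 on failure (the array is left untouched). */
int grow_scaffold(Scaffold *s) {
    size_t new_count;
    ScaffoldPart *new_parts;

    if (s->partAlloc == 0) {
        new_count = 32; /* arbitrary initial capacity for this sketch */
    } else {
        /* One division-based check covers both the doubling and the
         * multiplication by the element size. */
        if (s->partAlloc > SIZE_MAX / 2 / sizeof(ScaffoldPart))
            return -1;
        new_count = s->partAlloc * 2;
    }

    new_parts = realloc(s->parts, new_count * sizeof(ScaffoldPart));
    if (new_parts == NULL)
        return -1;

    s->parts = new_parts;
    s->partAlloc = new_count;
    return 0;
}
```

The important design choice is that the limit is computed by dividing SIZE_MAX rather than by multiplying the untrusted count, so the check itself cannot overflow.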

How an Attacker Could Craft an Exploit

The scaffold array grows while Expat records the content model of an element declaration in the DTD. If an attacker supplies a document whose DTD declares an extremely large content model, partAlloc keeps doubling. Eventually, the multiplication that computes the reallocation size can overflow.

Suppose sizeof(ScaffoldPart) == 32. Because new_size is partAlloc * 2, the byte count passed to REALLOC is partAlloc * 64. Once partAlloc exceeds SIZE_MAX / 64, that product wraps around, and REALLOC is asked for little or no memory.
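
The arithmetic can be checked with a short C sketch. It assumes a 64-bit size_t (modelled here as uint64_t so the numbers are the same on every platform) and the 32-byte element size from the text. The names PART_SIZE, requested_bytes, and max_safe_part_alloc are hypothetical.

```c
#include <stdint.h>

#define PART_SIZE 32u /* assumed sizeof(ScaffoldPart), as in the text */

/* Bytes the unchecked expression would request:
 * (part_alloc * 2) * PART_SIZE, wrapping modulo 2^64. */
uint64_t requested_bytes(uint64_t part_alloc) {
    uint64_t new_size = part_alloc * 2u;
    return new_size * PART_SIZE;
}

/* Largest part_alloc for which the request does not wrap.
 * With PART_SIZE == 32 this is UINT64_MAX / 64, i.e. 2^58 - 1. */
uint64_t max_safe_part_alloc(void) {
    return UINT64_MAX / (2u * PART_SIZE);
}
```

One entry past the limit, at 2^58, the request wraps to exactly 0 bytes; at 2^58 + 1 it wraps to 64 bytes, room for two entries, while the bookkeeping would record a capacity of more than 2^59 entries. An array that grows by doubling from a power-of-two capacity would land on the zero-byte case.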

If the program then writes into what it thinks is a big array, it may corrupt memory, possibly leading to a crash or even arbitrary code execution—especially in environments where heap exploitation is practical.

Real-World Impact

- Expat is widely used: You’ll find it in Python, Ruby, PHP, Apache, many Linux distributions, and various embedded IoT products.
- RCE (Remote Code Execution) Potential: In the worst cases, attackers could gain execution on a target parsing a malicious XML file.

Example Exploit Flow

<!DOCTYPE root [
<!ELEMENT root (child+)>
<!-- Define many recursive elements to force deep scaffolds -->
<!ELEMENT child (#PCDATA|child)*>
]>
<root>
  <!-- thousands of nested <child> elements here -->
</root>


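Parsing XML of this general shape can exercise the scaffold code in an unpatched parser, but a small document like this one will not overflow anything. The scaffold is sized by the content models in the <!ELEMENT> declarations rather than by element nesting in the document body, so an actual trigger would need a content model large enough to push the array toward the overflow threshold.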

How to Protect Yourself

1. Update Immediately: Upgrade to Expat 2.4.3 or any later version. Download from the official releases.
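
To confirm which version a program is actually running against, Expat provides XML_ExpatVersionInfo(), which returns a struct with major, minor, and micro fields. The comparison itself can be sketched as below. The helper name expat_version_is_patched is hypothetical, and the function takes plain integers so the example compiles without the Expat headers.

```c
#include <stdbool.h>

/* Returns true when major.minor.micro is 2.4.3 or later, the first
 * release that contains the fix for CVE-2022-22826.
 * In a real program the three values would come from the struct
 * returned by XML_ExpatVersionInfo(). */
bool expat_version_is_patched(int major, int minor, int micro) {
    if (major != 2)
        return major > 2;
    if (minor != 4)
        return minor > 4;
    return micro >= 3;
}
```

A version number alone is not conclusive: some distributions backport security fixes into older version numbers, so a distribution's own security advisory is the better source for packaged builds.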

References

- NVD Entry for CVE-2022-22826
- GitHub libexpat commit fixing the bug
- Security Release Notes
- OSS Security mailing list discussion

Final Thoughts

CVE-2022-22826 is a good example of how subtle errors in memory and integer calculations can have a big impact, especially in code that ends up everywhere, such as an XML parser. Patch fast, monitor advisories, and consider fuzzing critical parsing code in your projects.



Timeline

Published on: 01/10/2022 14:12:00 UTC
Last modified on: 06/14/2022 11:15:00 UTC