Expat, known formally as the Expat XML parser (libexpat), is a core library that helps many applications process XML data. It’s used everywhere—from desktop software to web services. That’s why security flaws in Expat are extra dangerous.
In December 2021, a new vulnerability was found: CVE-2022-22822. It’s rooted in an integer overflow bug inside the addBinding function in the xmlparse.c file. If an attacker exploits this, it can crash applications or even let the attacker run malicious code. Let's break this down and see how it works behind the scenes.
What’s the Issue?
Put simply, the problem happens when Expat is asked to process specially crafted XML data. The parsing code calculates how much memory to use for a namespace binding (the mapping from a prefix to its namespace URI). If the XML is crafted so that this calculation produces too large a value, the number overflows, meaning it wraps around to a small number. Because of this, Expat allocates far less memory than needed. When it then writes data into this too-small chunk, it overwrites memory it shouldn’t, opening the door to exploitation.
The Code Problem
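Here’s a simplified version of what happens in the vulnerable code. It is an illustration of the pattern, not the literal upstream source; the original lives in xmlparse.c in the libexpat source tree.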
static int
addBinding(XML_Parser parser,
           PREFIX *prefix,
           const ATTRIBUTE_ID *attId,
           const XML_Char *uri,
           BINDING **bindingsPtr)
{
    size_t uriLen = 0;

    // Calculate the length of the URI
    if (uri) {
        const XML_Char *u = uri;
        while (*u++ != '\0') uriLen++;
    }

    // Calculate the size we need.
    // Overflow happens here if uriLen is huge.
    size_t alloc_size = sizeof(BINDING) + (uriLen + 1) * sizeof(XML_Char);

    BINDING *binding = (BINDING *)malloc(alloc_size);
    if (!binding) return 0; // allocation failed

    // ...copy data into the object...
    *bindingsPtr = binding;
    return 1;
}
If uriLen is huge (e.g., driven by attacker-controlled XML), the calculation of alloc_size can “wrap around” and become a small number, because unsigned size_t arithmetic in C is performed modulo SIZE_MAX + 1. The allocation is then far smaller than it should be.
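To make the wraparound concrete, here is a minimal sketch of the same arithmetic in isolation. The two DEMO_ constants are hypothetical stand-ins for sizeof(BINDING) and sizeof(XML_Char), whose real values depend on the platform and on how Expat was built; the function names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for sizeof(BINDING) and sizeof(XML_Char). */
#define DEMO_HEADER_SIZE ((size_t)48)
#define DEMO_CHAR_SIZE   ((size_t)2)

/* The unchecked size computation from the simplified addBinding.
   size_t arithmetic is modulo SIZE_MAX + 1, so the result can wrap. */
static size_t naive_alloc_size(size_t uri_len)
{
    return DEMO_HEADER_SIZE + (uri_len + 1) * DEMO_CHAR_SIZE;
}

/* Walks through one ordinary and one wrapping input. */
static void wraparound_demo(void)
{
    /* Ordinary case: 48 + (10 + 1) * 2 = 70 bytes. */
    assert(naive_alloc_size(10) == 70);

    /* With uri_len = SIZE_MAX / 2, (uri_len + 1) * 2 equals
       SIZE_MAX + 1, which wraps to 0. Only the header is allocated. */
    assert(naive_alloc_size(SIZE_MAX / 2) == DEMO_HEADER_SIZE);
}
```

In the second case the caller would receive a 48-byte buffer while believing it has room for an enormous string, which is exactly the mismatch the later copy turns into a heap overflow.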
How Could Someone Exploit This?
- An attacker crafts an XML file with a namespace declaration whose URI (the value of an xmlns attribute) is long enough to drive uriLen to a huge value.
- The XML file is sent to an application using Expat (this could be via a file upload, web request, or internal message; wherever Expat is used).
- When Expat parses the XML, the overflow happens and not enough memory is allocated.
- The code uses memcpy or similar to copy the namespace URI into this allocation, overrunning it and potentially letting the attacker overwrite adjacent memory or crash the program.
Here’s a Proof of Concept XML (simplified—might require fine-tuning)
<root xmlns:AAAAAAAA...AAA="http://example.com/">
<element />
</root>
How Was It Fixed?
Developers patched xmlparse.c to check for overflows before doing memory allocation calculations.
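In terms of the simplified code above, the patched logic looks like this:
if (uriLen > (SIZE_MAX - sizeof(BINDING)) / sizeof(XML_Char) - 1) {
    // Error! Return failure before overflow
    return 0;
}
This check rejects any uriLen for which sizeof(BINDING) + (uriLen + 1) * sizeof(XML_Char) would exceed SIZE_MAX, so the size computation can no longer wrap.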
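The same idea can be packaged as a small helper that refuses to produce a size when the true result would not fit. This is a sketch under assumptions, not Expat's actual patch: the function name and its parameters (header standing in for sizeof(BINDING), char_size for sizeof(XML_Char)) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Computes header + (uri_len + 1) * char_size without ever wrapping.
   Returns 1 and stores the size in *out on success; returns 0 and
   leaves *out untouched if the result would not fit in size_t. */
static int checked_binding_size(size_t header, size_t char_size,
                                size_t uri_len, size_t *out)
{
    assert(out != NULL && char_size > 0);

    /* Largest character count (terminator included) that still fits
       after the header. Division cannot overflow, unlike multiplication. */
    size_t max_chars = (SIZE_MAX - header) / char_size;

    /* Guard max_chars == 0 first so that max_chars - 1 cannot wrap. */
    if (max_chars == 0 || uri_len > max_chars - 1)
        return 0;

    *out = header + (uri_len + 1) * char_size;
    return 1;
}
```

A caller would allocate only when the helper returns 1 and treat 0 as an out-of-memory style failure. Rearranging the comparison into a division keeps every intermediate value within range, which is why the check is done before the multiplication rather than by inspecting its result afterwards.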
Practical Impact
While not “RCE by default,” this kind of flaw is a gold mine for attackers looking to crash programs (DoS), or—combined with other bugs—get code execution. Many internet services, security products, and tools use Expat under the hood, so patching is urgent.
What Should You Do?
- Update Expat: If you're shipping software, make sure you are running Expat 2.4.3 or later.
- Patch Libraries: If you use a programming language or tool that embeds Expat (like Python, Perl, PHP), check upstream for patches.
- Audit Your Apps: Look for XML parsing in your code, and ensure you're not using vulnerable versions!
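As a starting point for such an audit, the version comparison itself can be sketched as below. The function name is hypothetical, and the snippet deliberately takes plain integers so it compiles without Expat's headers; in a real program the three numbers could come from XML_ExpatVersionInfo(), which returns a struct with major, minor, and micro fields.

```c
#include <assert.h>

/* Returns 1 if the given Expat version is 2.4.3 or later, else 0.
   2.4.3 is the first release that contains the fix. */
static int expat_version_has_fix(int major, int minor, int micro)
{
    assert(major >= 0 && minor >= 0 && micro >= 0);

    if (major != 2)
        return major > 2;
    if (minor != 4)
        return minor > 4;
    return micro >= 3;
}
```

Treat the result as a hint rather than proof: distributions often backport security fixes without changing the upstream version number, so a system library reporting an older version may already be patched.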
Final Note
This bug is a perfect example of why input validation and careful arithmetic are non-negotiable in low-level code. Integer overflows—often overlooked—can turn safe programs into attack vectors. Stay safe and always keep your dependencies up to date!
Timeline
Published on: 01/10/2022 14:12:00 UTC
Last modified on: 06/14/2022 11:15:00 UTC