If you’re working with spreadsheets in Java, you might know about Excel-Streaming-Reader. It’s a handy library that lets you read really large Excel files without loading them completely into memory. Under the hood, it uses Apache POI for processing Excel files (especially .xlsx). But back in early 2022, a pretty serious security issue popped up that was fixed in version 2.1.0 of xlsx-streamer.
Let’s break down what happened, why it matters, and what you should do—complete with real code samples and references.
🚨 What is CVE-2022-23640?
CVE-2022-23640 is a vulnerability in Excel-Streaming-Reader (published on Maven as the xlsx-streamer artifact) before version 2.1.0. The issue is an XML Entity Expansion bug, also known as a “Billion Laughs” attack.
This flaw allows attackers to craft malicious Excel files containing special XML content. When you read one of these files, your Java app can be forced to use a huge amount of memory, crash, or even open other attack paths.
Here’s what the CVE entry says:
> The XML parser used by Excel-Streaming-Reader before version 2.1.0 did not set all the necessary settings to prevent XML Entity Expansion issues. This makes it possible for an attacker to craft a file that causes a denial of service.
- Official CVE-2022-23640 entry
- Security advisory on GitHub
🛠️ How Does the Exploit Work? (With Code Snippet)
At the core, the bug is in how Excel-Streaming-Reader handled the parsing of Excel’s underlying XML files. If the parser is not properly configured, an attacker can use XML entities to make the parser expand data recursively. This can blow up your server’s memory.
Here’s an example chunk of malicious XML (these files are hidden inside .xlsx ZIP archive files):
<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe "XXX">
]>
<worksheet>
  <row>
    <c>
      <v>&xxe;</v>
    </c>
  </row>
</worksheet>
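The sample above defines only one small entity, so expanding &xxe; on its own is harmless. A real attack nests entities: each one references the previous one several times, so the fully expanded text grows exponentially with the nesting depth. A parser that accepts DOCTYPE declarations and does not limit expansion will try to expand every reference, which is what exhausts memory.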
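To make the growth concrete, here is a minimal sketch of the arithmetic behind a nested-entity payload. The class and method names (`EntityExpansionEstimate`, `buildDoctype`, `expandedLength`) and the entity names `e0`, `e1`, ... are made up for illustration and are not part of any library. It uses only the JDK.

```java
import java.math.BigInteger;

public class EntityExpansionEstimate {

    // Builds a DTD in which entity e(i) references e(i-1) `fanout` times.
    static String buildDoctype(int levels, int fanout, String seed) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE foo [\n");
        sb.append("<!ENTITY e0 \"").append(seed).append("\">\n");
        for (int i = 1; i <= levels; i++) {
            sb.append("<!ENTITY e").append(i).append(" \"");
            for (int j = 0; j < fanout; j++) {
                sb.append("&e").append(i - 1).append(';');
            }
            sb.append("\">\n");
        }
        return sb.append("]>").toString();
    }

    // Characters produced when the top-level entity is fully expanded:
    // seedLength * fanout^levels.
    static BigInteger expandedLength(int levels, int fanout, int seedLength) {
        return BigInteger.valueOf(fanout).pow(levels)
                .multiply(BigInteger.valueOf(seedLength));
    }

    public static void main(String[] args) {
        // Small DTD, printed only to show the nesting shape.
        System.out.println(buildDoctype(2, 2, "lol"));
        // Classic shape: 9 nested levels, 10 references per level, 3-char seed.
        System.out.println("Expanded characters: " + expandedLength(9, 10, 3));
    }
}
```

With 9 levels and 10 references per level, a DTD of a few hundred bytes asks the parser to produce three billion characters, which is where the “Billion Laughs” name comes from.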
When you use Excel-Streaming-Reader in your Java code, something as innocent-looking as this can be fatal if the XML parser isn’t safe:
import com.monitorjbl.xlsx.StreamingReader;
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.poi.ss.usermodel.Workbook;

public class Demo {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the stream and the workbook even if parsing fails
        try (InputStream is = new FileInputStream("billion-laughs.xlsx");
             Workbook workbook = StreamingReader.builder()
                     .rowCacheSize(100) // keep 100 rows in memory at a time
                     .bufferSize(4096)  // read buffer of 4 KB
                     .open(is)) {       // streams the file instead of loading it whole
            // Your processing code...
        }
    }
}
If billion-laughs.xlsx is specially crafted by an attacker, the above code can crash your server or use lots of CPU/RAM!
- Impact: High. Can cause Denial of Service (DoS) by exhausting system memory.
- Requirements: No authentication needed. The attacker only needs your app to open or parse their Excel file.
If you’re curious about what a Billion Laughs XML payload looks like, here’s a detailed breakdown on OWASP.
Proof-of-Concept (PoC)
You can create a .xlsx file with crafted XML (for demo purposes only!) by unzipping a real one and replacing a worksheet part with XML like the sample above. Then try loading it with an old xlsx-streamer release (before 2.1.0). With a nested-entity payload, you should see your Java process’s memory spike.
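When inspecting a test file like this, it helps to see which parts of the archive carry a DOCTYPE at all. The sketch below is an inspection aid only, not a defense: it reads just the first few kilobytes of each part and matches ASCII-compatible encodings, so a determined attacker could evade it. The class name `XlsxDoctypeScan`, its methods, and the `PROLOG_BYTES` cap are hypothetical, and `readNBytes(int)` assumes Java 11 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class XlsxDoctypeScan {
    // Hypothetical cap: only the start of each part is examined.
    private static final int PROLOG_BYTES = 8192;

    // Returns the names of XML parts whose first bytes contain "<!DOCTYPE".
    static List<String> partsWithDoctype(byte[] xlsx) {
        List<String> hits = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(xlsx))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (!name.endsWith(".xml") && !name.endsWith(".rels")) {
                    continue; // skip binary parts such as images
                }
                byte[] head = zip.readNBytes(PROLOG_BYTES);
                if (new String(head, StandardCharsets.ISO_8859_1).contains("<!DOCTYPE")) {
                    hits.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    // Builds a tiny in-memory ZIP that mimics the layout of an .xlsx file.
    static byte[] sampleArchive() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            addPart(zip, "xl/workbook.xml", "<?xml version=\"1.0\"?><workbook/>");
            addPart(zip, "xl/worksheets/sheet1.xml",
                    "<?xml version=\"1.0\"?><!DOCTYPE foo [<!ENTITY xxe \"XXX\">]><worksheet/>");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static void addPart(ZipOutputStream zip, String name, String xml) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(xml.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    public static void main(String[] args) {
        System.out.println(partsWithDoctype(sampleArchive())); // prints [xl/worksheets/sheet1.xml]
    }
}
```

Legitimate workbooks normally have no reason to declare a DOCTYPE in these parts, so any hit is worth a closer look.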
To fix the issue, upgrade the dependency. With Maven:
<dependency>
  <groupId>com.monitorjbl</groupId>
  <artifactId>xlsx-streamer</artifactId>
  <version>2.1.0</version>
</dependency>
If xlsx-streamer reaches your build transitively through another library, make sure the version that actually gets resolved is 2.1.0 or later.
🧰 What Did the Patch Change?
The patch changes how the XML parser is set up under the hood. Now, it explicitly disables XML entity expansion and other dangerous XML features, following security best practices:
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
With DOCTYPE declarations disallowed, the parser rejects the document before any entity can be defined, so there is nothing left to expand.
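The effect of these settings can be checked with the JDK’s built-in parser alone. The sketch below is not the library’s actual source; `HardenedSaxDemo`, `newHardenedParser`, and `isRejected` are hypothetical names. It assumes a Xerces-derived parser (such as the JDK default) that recognizes the `disallow-doctype-decl` feature.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class HardenedSaxDemo {

    // Creates a SAX parser that refuses DOCTYPE declarations and external entities.
    static SAXParser newHardenedParser() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return factory.newSAXParser();
    }

    // Returns true if the hardened parser refuses the document.
    // Note: any SAX error counts, including plain malformed XML.
    static boolean isRejected(String xml) {
        SAXParser parser;
        try {
            parser = newHardenedParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Parser does not support the hardening features", e);
        }
        try {
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return false;
        } catch (SAXException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String withDoctype = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE foo [<!ENTITY xxe \"XXX\">]>"
                + "<worksheet><row><c><v>&xxe;</v></c></row></worksheet>";
        String plain = "<?xml version=\"1.0\"?><worksheet><row><c><v>1</v></c></row></worksheet>";

        System.out.println("DOCTYPE document rejected: " + isRejected(withDoctype));
        System.out.println("Plain document rejected:   " + isRejected(plain));
    }
}
```

Once `disallow-doctype-decl` is on, the two external-entity features are largely redundant, but setting them as well is a common defense-in-depth choice in case the first feature is ever relaxed.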
📝 References and More Reading
- CVE-2022-23640 official entry
- xlsx-streamer 2.1.0 release notes
- GitHub security advisory
- OWASP XML External Entity (XXE) Prevention Cheat Sheet
Stay Informed: Watch for security advisories on libraries you depend on.
TL;DR:
If your app lets users upload or open Excel files, or if you process files from untrusted sources, this bug could take it offline. Upgrade to xlsx-streamer 2.1.0 or later right now!
Timeline
Published on: 03/02/2022 20:15:00 UTC
Last modified on: 03/09/2022 18:01:00 UTC