Talend Data Catalog is a popular data governance platform used by organizations to capture, manage, and discover data assets. However, a serious vulnerability, tracked as CVE-2023-26264, was identified that could let attackers read sensitive files or interact with internal systems through a simple XML payload in a license file. If you’re using any version before 8.0-20220907, your system could be at risk.
In this deep dive, we’ll explain what the weakness is, show how it works, and share references for further reading.
What Is CVE-2023-26264?
CVE-2023-26264 is a security bug in Talend Data Catalog’s license parsing code. It stems from improper handling of XML input, specifically the resolution of XML external entities, and is commonly called an XML External Entity (XXE) vulnerability.
Talend’s license system consumes XML files. If the underlying XML parser is not configured safely, attackers can craft malicious license files that instruct the server to load external resources like local files or even remote URLs. This gives attackers a potential pathway to internal secrets, configuration files, and more.
Who Is Affected?
All versions of Talend Data Catalog before 8.0-20220907 are vulnerable. Both on-premises and cloud installations are potentially at risk, depending on who can upload or submit license files.
How Does The Attack Work?
XXE attacks exploit a feature of XML called “external entities.” By declaring such an entity in the document’s DTD, the attacker can smuggle a reference to arbitrary files or URLs into the input consumed by the XML parser.
Here’s a simplified malicious license file:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<license>
<serial>&xxe;</serial>
<customer>ExampleCorp</customer>
</license>
If the application parses this XML without proper security settings, the &xxe; reference is replaced with the contents of /etc/passwd (on Linux systems), or of any other path the attacker specifies. The attacker simply uploads this as their custom license file.
A successful attack lets the attacker:
- Read arbitrary files: harvest secrets, config files, credentials, and key files from the server.
- SSRF (Server-Side Request Forgery): make requests to internal-only services that are normally unreachable from outside the network.
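To make the entity substitution concrete, here is a minimal Java sketch that reproduces it locally against a harmless temporary file instead of /etc/passwd. The class name, the helper names, and the license/serial element names are all made up for illustration; nothing here is taken from Talend’s code. It assumes a JDK whose JAXP defaults have not been tightened, in which case the serial field is expected to come back holding the file’s contents.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XxeFileReadDemo {

    // Builds a license-like document whose <serial> element references an
    // external entity pointing at the given URI (element names are illustrative).
    static String buildPayload(String targetUri) {
        return "<?xml version=\"1.0\"?>\n"
             + "<!DOCTYPE license [\n"
             + "  <!ENTITY xxe SYSTEM \"" + targetUri + "\">\n"
             + "]>\n"
             + "<license><serial>&xxe;</serial></license>";
    }

    // Parses the payload with an unhardened factory and returns the text of <serial>.
    static String parseSerialUnsafely(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); // default settings
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("serial").item(0).getTextContent();
    }

    // Writes `secret` to a temp file, then tries to read it back through the entity.
    static String leakThroughEntity(String secret) {
        Path file = null;
        try {
            file = Files.createTempFile("xxe-demo", ".txt");
            Files.write(file, secret.getBytes(StandardCharsets.UTF_8));
            return parseSerialUnsafely(buildPayload(file.toUri().toString()));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            if (file != null) {
                file.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        // The temp file stands in for a sensitive file on the server.
        System.out.println("serial = " + leakThroughEntity("TOP-SECRET-TOKEN"));
    }
}
```

The application never asked to read that file; the parser did it while expanding the entity, which is why the problem has to be addressed in the parser configuration rather than in the code that consumes the parsed license.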
Demo: Exploiting XXE in Talend Data Catalog License
If you had access to upload a license, your POST request to the Talend server (e.g. a hypothetical /api/license/upload endpoint) might carry the malicious XML above as the license file.
The vulnerable code is likely calling Java’s DocumentBuilderFactory without disabling DTD and external entity processing:
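// Bad example: a default factory leaves DTDs and external entities enabled
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(uploadedLicenseFile);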
How To Fix:
Configure the parser to reject DTDs and turn off external entity resolution:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Good practice: disallow DTDs and XXE
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
// Defense in depth, as suggested in the OWASP XXE cheat sheet
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(uploadedLicenseFile);
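A quick way to confirm that a hardened configuration behaves as intended is to feed it both a benign and a malicious document. The sketch below does that with the JDK’s built-in parser; the class name, the sample documents, and the accepts helper are assumptions made for illustration and are not part of Talend’s code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class HardenedParserCheck {

    // Illustrative documents; the element names are not Talend's real schema.
    static final String MALICIOUS_LICENSE =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE license [ <!ENTITY xxe SYSTEM \"file:///etc/passwd\"> ]>\n"
        + "<license><serial>&xxe;</serial></license>";

    static final String BENIGN_LICENSE =
        "<?xml version=\"1.0\"?><license><serial>ABC-123</serial></license>";

    // Factory with DTDs and external entities switched off.
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    // Returns true if the hardened parser accepts the document,
    // false if it rejects it with a parse error.
    static boolean accepts(String xml) {
        try {
            hardenedFactory().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("benign accepted:    " + accepts(BENIGN_LICENSE));
        System.out.println("malicious accepted: " + accepts(MALICIOUS_LICENSE));
    }
}
```

Because the DOCTYPE declaration itself is refused, the parser stops before any entity is resolved, so the referenced file is never opened. The trade-off is that legitimate documents carrying a DOCTYPE are rejected too, which is usually acceptable for a license file.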
References
NVD Entry (CVE-2023-26264):
https://nvd.nist.gov/vuln/detail/CVE-2023-26264
XXE Explanation & Exploitation (PortSwigger):
https://portswigger.net/web-security/xxe
Talend Release Notes (8.0-20220907):
https://help.talend.com/r/en-US/8./security-updates-data-catalog
OWASP XXE Cheat Sheet:
https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
What Should You Do?
1. Upgrade Talend Data Catalog to version 8.0-20220907 or later.
2. Restrict who can upload license files. Only let trusted admins access license features.
3. Monitor server logs and file access for signs of unusual or failed file access within license parsing routines.
4. Disable DTD and external entity parsing in any custom or third-party XML code your organization uses.
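For the last point, the same hardening applies to the other XML APIs that ship with the JDK. The following is a rough sketch for SAX and StAX, using settings along the lines of the OWASP cheat sheet; the class name, helper names, and sample document are invented for illustration, and other libraries (or non-JDK parser implementations) may need different switches.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class OtherParserHardening {

    // Illustrative XXE probe document.
    static final String DOCTYPE_DOC =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE r [ <!ENTITY xxe SYSTEM \"file:///etc/passwd\"> ]>\n"
        + "<r>&xxe;</r>";

    // SAX: same feature URIs as for DocumentBuilderFactory.
    static SAXParserFactory hardenedSax() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        spf.setXIncludeAware(false);
        return spf;
    }

    // StAX: turn off DTD support and external entity resolution.
    static XMLInputFactory hardenedStax() {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        xif.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        xif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return xif;
    }

    // Returns true if the hardened SAX parser refuses the document.
    static boolean saxRejects(String xml) {
        try {
            hardenedSax().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return false;
        } catch (SAXParseException e) {
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the character data seen by the hardened StAX reader,
    // or null if the reader refuses the document.
    static String staxText(String xml) {
        StringBuilder text = new StringBuilder();
        try {
            XMLStreamReader reader =
                hardenedStax().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamReader.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            return text.toString();
        } catch (XMLStreamException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("SAX rejects DOCTYPE: " + saxRejects(DOCTYPE_DOC));
        String text = staxText(DOCTYPE_DOC);
        System.out.println("StAX text: " + (text == null ? "(document refused)" : text));
    }
}
```

Whether the StAX reader refuses the document outright or simply leaves the entity unexpanded can vary by implementation, so the check worth making in your own tests is that the file contents never appear in the parsed output.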
Final Thoughts
CVE-2023-26264 is a reminder of the dangers of XML parsing with unsafe defaults. Even a straightforward feature like license verification can become a backdoor to sensitive data if good security defaults are not enforced. If you’re running Talend Data Catalog and haven’t checked your version in a while, now is the time to verify and patch!
Stay safe and update your software. If you want to learn more, dig into the reference links or try reproducing in a safe test environment.
*This post was written exclusively for readers seeking plain English explanations of complex vulnerabilities. Questions welcome!*
Timeline
Published on: 04/13/2023 19:15:00 UTC
Last modified on: 04/21/2023 04:19:00 UTC