If you’re building Ruby apps that parse XML or HTML, you probably use Nokogiri—a fast, feature-rich open source library trusted by thousands of developers. But did you know that, until recently, you could crash your server with a single bad value passed to Nokogiri’s parsers?
In this article, I’ll walk you through CVE-2022-29181: what happened, how an attacker could exploit it, and how you can protect your applications. We’ll look at real code, the reasons behind the bug, and the fixes—using simple, clear language.
What is CVE-2022-29181?
CVE-2022-29181 is a security flaw in Nokogiri versions before 1.13.6. It affects how Nokogiri handles input passed to its XML and HTML4 SAX parsers: the type-checking was insufficient, so feeding them a specially crafted object (not a plain String) could make Nokogiri’s native code access memory it shouldn’t. This could lead to:
- Crashing the Ruby process with a segmentation fault
- Reading data from other parts of memory
While this attack doesn’t let someone run arbitrary code, reading from memory or crashing a server can still be pretty serious—especially if your application processes untrusted data from users or the internet.
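The apps at risk are those that parse XML or HTML using the SAX interface and feed it untrusted input. According to the advisory, this applies to CRuby (the standard C implementation of Ruby); JRuby users are not affected.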
If you’re just parsing static files or trusted internal data, you’re less at risk—but it’s always smart to keep your dependencies up to date.
How could an exploit work?
Let’s say you’re running a Rails (or any Ruby) app that takes user-submitted XML for processing. An attacker arranges for the value that reaches the parser to be something other than a String, such as an integer or a custom object. For example, a JSON request body can deserialize into an Integer, Array, or Hash. When Nokogiri’s parser treats that value as string data, it might read from somewhere unexpected in memory.
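One way a non-String value can show up is through a JSON request body. The sketch below uses only Ruby's standard `json` library; the `request_body` payload and the `xml` key are made up for illustration and do not come from the advisory.

```ruby
require "json"

# Hypothetical request body: the client sends a number where the app expects XML text.
request_body = '{"xml": 12345}'

params = JSON.parse(request_body)
xml_param = params["xml"]

# The parsed value is an Integer, even though the app expected a String here.
puts xml_param.class          # Integer
puts xml_param.is_a?(String)  # false
```

If a value like `xml_param` is handed straight to a SAX parser on an affected version, the parser receives exactly the kind of non-String input the CVE describes.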
Here’s a simplified proof of concept:
require 'nokogiri'

# Imagine this is untrusted user input
class EvilInput
  def to_str
    nil
  end

  # to_s returns a number, not a string!
  def to_s
    123456
  end
end

# Nokogiri expects a String, but gets EvilInput
input = EvilInput.new

# Use a real (no-op) SAX document handler rather than a bare Class
parser = Nokogiri::XML::SAX::Parser.new(Nokogiri::XML::SAX::Document.new)

# On affected versions (< 1.13.6) this can crash the Ruby process
parser.parse(input)
Running the above on an affected version can crash the interpreter, which is a denial of service.
The Fix: 1.13.6 and above
The Nokogiri team patched this in version 1.13.6:
According to the release notes, the SAX parser methods now type-check their input and raise a TypeError, instead of segfaulting, when they are passed something that is not a String.
That means, if an attacker tries to pass a weird object, Nokogiri will refuse to process it instead of accidentally reading the wrong memory.
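To check whether a given version falls in the affected range, a comparison against 1.13.6 can be sketched with `Gem::Version`, which ships with Ruby. The helper name `affected_nokogiri_version?` is hypothetical and used here only for illustration.

```ruby
# Hypothetical helper: true when the version string predates the 1.13.6 fix.
# Gem::Version comes with RubyGems, so no extra gem is needed.
def affected_nokogiri_version?(version_string)
  Gem::Version.new(version_string) < Gem::Version.new("1.13.6")
end

puts affected_nokogiri_version?("1.13.5")  # true
puts affected_nokogiri_version?("1.13.6")  # false
puts affected_nokogiri_version?("1.14.0")  # false
```

In a real app you would pass `Nokogiri::VERSION` after requiring the gem. In a Gemfile, a constraint such as `gem "nokogiri", ">= 1.13.6"` keeps Bundler from resolving to an affected release.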
Example of safe behavior in 1.13.6+
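# Assumes nokogiri is loaded and EvilInput is defined as in the proof of concept
parser = Nokogiri::XML::SAX::Parser.new(Nokogiri::XML::SAX::Document.new)
begin
  parser.parse(EvilInput.new)
rescue TypeError => e
  puts "Type error caught! Attack prevented. (#{e.message})"
end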
Workaround For Older Versions
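Can’t upgrade? The advisory’s workaround is to make sure the parser only ever receives a real Ruby String, by calling to_s or an equivalent on untrusted input. Because a custom object’s to_s can return anything, it is safer to verify the result as well:
string_input = user_supplied_data.to_s # Convert input to a String
raise TypeError, "expected a String" unless String === string_input
parser.parse(string_input)
Avoid passing user objects or data structures directly. Note that the conversion does not make the content trustworthy; it only guarantees the type the parser expects.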
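The guard idea can be sketched in plain Ruby, with no Nokogiri dependency. The helper name `ensure_string!` and the `misleading` object are hypothetical, made up to show why calling `to_s` alone is not a guarantee.

```ruby
# Hypothetical guard: accept only genuine String instances, reject everything else.
def ensure_string!(input)
  # Module#=== asks String itself, so an object cannot pass by overriding its own is_a?.
  return input if String === input

  raise TypeError, "expected a String, got #{input.class}"
end

# A stand-in object whose to_s does not return a String.
misleading = Object.new
def misleading.to_s
  123456
end

puts ensure_string!("<root/>")  # <root/>
puts misleading.to_s.class      # Integer

begin
  ensure_string!(misleading.to_s)
rescue TypeError => e
  puts e.message                # expected a String, got Integer
end
```

Calling a guard like this right before `parser.parse` means a wrong type fails with an ordinary Ruby exception rather than reaching native code.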
Key Takeaways
- Always sanitize untrusted input. Don’t let places expecting a String accidentally get something else.
- Update your gems! The fix landed in Nokogiri 1.13.6, so use that version or later.
- Don’t ignore segfaults: If your app crashes, check that your dependencies are up to date, especially when parsing user data.
References (and further reading)
- CVE-2022-29181 at NVD
- Nokogiri GitHub Security Advisory
- Upgrade Announcement / Release Notes 1.13.6
- RubyGems Nokogiri
Bottom line: Type errors are not just a matter of good coding—they can lead to real security flaws. CVE-2022-29181 is a great example of why libraries need to be careful about assumptions, and why developers should keep a close watch on their dependencies.
If you use Nokogiri, check your version, update fast, and always validate your inputs. Safe coding!
Timeline
Published on: 05/20/2022 19:15:00 UTC
Last modified on: 08/15/2022 11:20:00 UTC