If you’re building Ruby apps that parse XML or HTML, you probably use Nokogiri—a fast, feature-rich open source library trusted by thousands of developers. But did you know that, until recently, you could crash your server with a single bad value passed to Nokogiri’s parsers?  

In this article, I’ll walk you through CVE-2022-29181: what happened, how an attacker could exploit it, and how you can protect your applications. We’ll look at real code, the reasons behind the bug, and the fixes—using simple, clear language.

What is CVE-2022-29181?

CVE-2022-29181 is a security flaw in Nokogiri versions before 1.13.6. It affects how Nokogiri handles input passed to its XML and HTML4 SAX parsers: the input was not fully type-checked, so a specially crafted value that is not a plain String could make the native code access memory it shouldn’t. The advisory applies to CRuby; JRuby is not affected. This could lead to:

- Illegal memory access that crashes the Ruby process (a segfault)
- Reading data from other parts of memory

While this attack doesn’t let someone run arbitrary code, reading from memory or crashing a server can still be pretty serious—especially if your application processes untrusted data from users or the internet.

The applications most exposed are those that parse XML or HTML through the SAX interface and pass it untrusted input.

If you’re just parsing static files or trusted internal data, you’re less at risk—but it’s always smart to keep your dependencies up to date.
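
To check whether a given install falls in the affected range, you can compare version strings with RubyGems' `Gem::Version`. This is a minimal sketch: the helper name `affected_by_cve_2022_29181?` is made up for illustration, and it only looks at the version number (it does not account for JRuby being unaffected).

```ruby
require 'rubygems' # provides Gem::Version (loaded by default in modern Ruby)

# Hypothetical helper: true if the given Nokogiri version string predates
# 1.13.6, the first release that contains the fix.
def affected_by_cve_2022_29181?(version_string)
  Gem::Version.new(version_string) < Gem::Version.new('1.13.6')
end

puts affected_by_cve_2022_29181?('1.13.5') # true
puts affected_by_cve_2022_29181?('1.13.6') # false
```

In an application you would pass it `Nokogiri::VERSION` after `require 'nokogiri'`.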

How could an exploit work?

Let’s say you’re running a Rails (or any Ruby) app that takes user-submitted XML for processing. Instead of a String, an attacker manages to get a non-String value, such as an integer or some other object, passed to the parser. Because affected versions treat that value as a String without checking its type, Nokogiri’s native code might read from somewhere unexpected in memory.

Here’s a simplified proof-of-concept exploit:

require 'nokogiri'

# Imagine this is untrusted user input
class EvilInput
  def to_str
    nil
  end

  # to_s returns a number, not a string!
  def to_s
    123456
  end
end

# Nokogiri expects a String, but gets EvilInput
input = EvilInput.new

# On affected versions (CRuby, Nokogiri < 1.13.6) this can crash the process
# or read unrelated memory; the exact outcome depends on the Ruby build.
parser = Nokogiri::XML::SAX::Parser.new(Nokogiri::XML::SAX::Document.new)
parser.parse(input)

Running the above on an affected version can crash the interpreter, which is a denial of service.
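
How would a non-String ever reach the parser in practice? One plausible route (an assumption for illustration, not something the advisory spells out) is a JSON API where the XML is expected in a string field. The client controls the JSON type, so the field can arrive as an Integer, Array, or Hash instead. The `xml` field name and the `xml_field` helper are hypothetical.

```ruby
require 'json'

# Hypothetical request-handling fragment: pull the "xml" field out of a JSON body.
def xml_field(json_body)
  JSON.parse(json_body)['xml']
end

puts xml_field('{"xml": "<root/>"}').class   # String
puts xml_field('{"xml": 123456}').class      # Integer
puts xml_field('{"xml": ["<root/>"]}').class # Array
```

Handing the second or third value straight to a SAX parser is the situation the CVE describes.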

The Fix: 1.13.6 and above

The Nokogiri team patched this in version 1.13.6. From that release on, the SAX parsers type-check their input and raise a TypeError when it is not a String, instead of segfaulting.

That means that if an attacker tries to pass an unexpected object, Nokogiri will refuse to process it instead of accidentally reading the wrong memory.

Example of safe behavior in 1.13.6+

parser = Nokogiri::XML::SAX::Parser.new(Nokogiri::XML::SAX::Document.new)
begin
  parser.parse(EvilInput.new)
rescue TypeError => e
  # Nokogiri rejected the non-String input before reading from it
  puts "Type error caught: #{e.message}"
end

Workaround For Older Versions

Can't upgrade? You can mitigate the risk by making sure the parser gets only a plain, safe Ruby String for input:

xml_string = user_supplied_data.to_s  # Coerce the input to a String before parsing
parser.parse(xml_string)

Avoid passing user-supplied objects or data structures directly. Always convert them to a real String first.
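
Note that calling `to_s` only helps if it actually returns a String. A stricter variation, sketched here on the assumption that rejecting unexpected types is acceptable for your app, is to check the type explicitly before the value gets anywhere near the parser. `xml_string!` is a hypothetical helper name, not part of Nokogiri.

```ruby
# Hypothetical guard: return the input only if it is a genuine String,
# otherwise raise a TypeError (the same error class Nokogiri 1.13.6+ raises).
def xml_string!(input)
  unless input.is_a?(String)
    raise TypeError, "expected String, got #{input.class}"
  end

  input
end

xml_string!('<root/>') # passes the String through unchanged

begin
  xml_string!(123456)
rescue TypeError => e
  puts e.message # expected String, got Integer
end
```

With this in place, the call site becomes `parser.parse(xml_string!(user_supplied_data))`.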

Key Takeaways

- Always validate untrusted input. Make sure code that expects a String actually receives one.
- Update your gems! The issue is fixed in Nokogiri 1.13.6 and later.
- Don’t ignore segfaults: If your app crashes, check that your dependencies are up to date, especially when parsing user data.

Bottom line: Type errors are not just a matter of good coding—they can lead to real security flaws. CVE-2022-29181 is a great example of why libraries need to be careful about assumptions, and why developers should keep a close watch on their dependencies.

If you use Nokogiri, check your version, update fast, and always validate your inputs. Safe coding!

Timeline

Published on: 05/20/2022 19:15:00 UTC
Last modified on: 08/15/2022 11:20:00 UTC