CVE-2024-11168, published in November 2024, describes a subtle but potentially dangerous flaw in Python’s standard library, specifically in the widely used urllib.parse.urlsplit() and urlparse() functions. It concerns how these functions handle URLs containing bracketed hosts (like [example]). This long read breaks down what went wrong, why it matters, and how attackers could abuse the bug against your server, especially when your code interacts with multiple URL parsers.
1. What’s the Vulnerability?
CVE-2024-11168 affects Python’s urllib.parse.urlsplit() and urlparse() functions. These functions failed to properly validate hostnames inside square brackets, meaning they would accept square-bracketed hosts in URLs even when those hosts were not valid IPv6 or IPvFuture addresses.
This is not compliant with RFC 3986, which only allows hostnames in brackets if they’re valid IPv6 or IPvFuture addresses.
Why is this dangerous?
If your application passes a URL parsed by one library (e.g., Python’s urlparse()) into another library (written in Go, Java, JavaScript, etc.), then these two parsers might interpret the host part differently. This can lead to server-side request forgery (SSRF), where malicious input tricks the server into making internal network requests.
According to RFC 3986, Section 3.2.2:
> A host identified by an IP literal address is enclosed in square brackets.
>
> host = IP-literal / IPv4address / reg-name
>
> IP-literal = "[" ( IPv6address / IPvFuture ) "]"
This means, for example:
- Valid: http://[::1]/ (IPv6 loopback)
- Invalid: http://[internal]/ (not IPv6 or IPvFuture)
But Python's urlparse() would wrongly accept both.
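The bracket rule from the ABNF above can be approximated with the standard `ipaddress` and `re` modules. This is an illustrative sketch, not CPython's actual implementation: the helper name `is_valid_ip_literal` is made up for this example, and the IPvFuture pattern is a simplified reading of the RFC 3986 grammar.

```python
import ipaddress
import re

# Simplified IPvFuture pattern from RFC 3986:
#   "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
_IPVFUTURE = re.compile(r"[vV][0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&'()*+,;=:]+")


def is_valid_ip_literal(bracket_content: str) -> bool:
    """Return True if the text between '[' and ']' is an IPv6 or IPvFuture address."""
    if _IPVFUTURE.fullmatch(bracket_content):
        return True
    try:
        # IPv6Address raises a ValueError subclass for anything that is not IPv6,
        # including plain IPv4 addresses and ordinary hostnames.
        ipaddress.IPv6Address(bracket_content)
    except ValueError:
        return False
    return True


for candidate in ("::1", "2001:db8::1", "v1.fe80", "internal", "127.0.0.1"):
    print(candidate, is_valid_ip_literal(candidate))
```

Under this check, `::1` and `2001:db8::1` pass, while `internal` and the bracketed IPv4 address `127.0.0.1` do not.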
Here’s how it looks in practice:
from urllib.parse import urlsplit
# Bracketed host that is neither IPv6 nor IPvFuture
result = urlsplit("http://[not-an-ip]/path")
print(result)
print(result.hostname)  # the hostname attribute strips the surrounding brackets
Before the fix, this prints:
SplitResult(scheme='http', netloc='[not-an-ip]', path='/path', query='', fragment='')
not-an-ip
But [not-an-ip] is not a valid IPv6 or IPvFuture address. A strict parser should reject this or interpret it differently.
Where it hurts
If you check for “internal” hosts by inspecting the raw netloc and looking for names like localhost or 127.0.0.1, and an attacker sends http://[localhost]/, an unpatched Python will report the netloc as [localhost], not localhost (the hostname attribute strips the brackets, but netloc does not). If another system strips the brackets and treats the host as local, you’ve got a bypass.
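One detail worth knowing when writing such checks: on a `SplitResult`, `netloc` keeps the square brackets, while `hostname` strips them and lowercases the result. The snippet below uses a valid IPv6 literal so that it behaves the same on patched and unpatched interpreters; the port and path are arbitrary.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://[::1]:8080/admin")

print(parts.netloc)    # authority as written, brackets and port included
print(parts.hostname)  # brackets removed, lowercased
print(parts.port)      # port parsed as an int
```

A filter that compares `netloc` against a list of names therefore sees a different string than one that compares `hostname`.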
4. Real Attack Scenarios: SSRF
Server Side Request Forgery (SSRF) is when attackers trick your server into making internal HTTP calls—often breaching network boundaries. This vulnerability makes it possible by allowing mismatched parsing between libraries.
Scenario
1. User submits a URL: http://[localhost]/admin
2. Your Python backend inspects the host using urlparse(), doesn’t see obvious issues, and passes the URL along.
3. Another backend library or proxy (in Go, Java, etc.) parses [localhost] as localhost (internal address) and makes a privileged request.
The result: Attackers reach internal services, cloud metadata endpoints, or other IPs never meant to be exposed to the Internet.
5. Exploit Walkthrough
Here’s a simple proof of concept showing how a naive Python check can be bypassed.
A netloc-based check that rejects “localhost”, but not “[localhost]”:
from urllib.parse import urlparse

def is_host_blacklisted(url):
    # Naive check: compares the raw netloc, which keeps any square brackets
    host = urlparse(url).netloc
    return host in ("127.0.0.1", "localhost")

# Attacker supplies a bracketed host
test_url = "http://[localhost]/secret"
print(is_host_blacklisted(test_url))  # False on unpatched Python; patched versions raise ValueError
# But another system may resolve "[localhost]" as just "localhost"!
What could an attacker do?
- Steal AWS/GCP metadata using: http://[169.254.169.254]/
- Reach forbidden internal dashboards: http://[internal-service]/
- Exploit cloud or container APIs
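One way to reason about targets like these is to classify the address itself rather than match strings. The following is a minimal sketch using the standard `ipaddress` module; the function name `is_internal_address` is hypothetical, and it does not attempt to handle DNS names that resolve to internal addresses, which a real deployment would also need to cover.

```python
import ipaddress


def is_internal_address(host: str) -> bool:
    """Return True if host is an IP literal in a loopback, private, or link-local range.

    Surrounding square brackets are stripped first, so "[169.254.169.254]"
    and "169.254.169.254" are treated alike. Strings that are not IP
    addresses (including names such as "localhost") return False.
    """
    candidate = host.strip()
    if candidate.startswith("[") and candidate.endswith("]"):
        candidate = candidate[1:-1]
    try:
        ip = ipaddress.ip_address(candidate)
    except ValueError:
        return False
    return ip.is_loopback or ip.is_private or ip.is_link_local


print(is_internal_address("[169.254.169.254]"))  # cloud metadata address
print(is_internal_address("[::1]"))              # IPv6 loopback
print(is_internal_address("8.8.8.8"))            # public address
```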
6. How to Fix and Protect Yourself
Python’s upstream patch (see CPython GH-115390) tightens validation so only valid IPv6 and IPvFuture addresses are accepted inside square brackets.
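A quick way to see which behavior an interpreter has is to feed it a bracketed non-IP host and check whether parsing fails. This is a minimal probe, assuming that patched versions raise `ValueError` for such input; the function name is made up for illustration.

```python
from urllib.parse import urlsplit


def rejects_invalid_bracketed_host() -> bool:
    """Return True if urlsplit() refuses a bracketed host that is not an IP literal."""
    try:
        urlsplit("http://[not-an-ip]/path")
    except ValueError:
        return True
    return False


if rejects_invalid_bracketed_host():
    print("urlsplit() rejects invalid bracketed hosts")
else:
    print("urlsplit() accepts invalid bracketed hosts; consider upgrading")
```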
What should you do?
- Upgrade Python to a patched version (see release notes for 3.12.x, 3.11.x, 3.10.x in Python Security Announcements).
- Always validate netloc/hostname using strict whitelists and RFC 3986 checking, not ad-hoc checks.
- If you must process URLs across multiple languages/libraries, use uniform parsing routines or sanitize input clearly.
Watch out for brackets! If you see unusual bracketed hosts, treat them as suspicious.
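The advice above can be combined into a single gate in front of any outbound request. The sketch below is one possible shape, not a complete SSRF defense: `ALLOWED_HOSTS` and `is_url_allowed` are placeholder names, and the example hosts are invented. It assumes the application only ever needs to reach named hosts, so any bracketed host can be refused outright.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts your application may contact.
ALLOWED_HOSTS = frozenset({"api.example.com", "cdn.example.com"})


def is_url_allowed(url: str) -> bool:
    """Allow only http(s) URLs whose host is on the allowlist and not bracketed."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
        _ = parts.port  # accessing .port raises ValueError for malformed ports
    except ValueError:
        # Patched interpreters raise here for invalid bracketed hosts.
        return False
    if parts.scheme not in ("http", "https") or host is None:
        return False
    if "[" in parts.netloc or "]" in parts.netloc:
        # No allowlisted name is an IP literal, so any bracket is suspicious.
        return False
    return host in ALLOWED_HOSTS


print(is_url_allowed("https://api.example.com/v1"))
print(is_url_allowed("http://[localhost]/admin"))
```

Because the bracket test runs on `netloc`, the function returns the same answer for `http://[localhost]/admin` whether or not the interpreter has the fix.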
7. References & Further Reading
- Python Security Advisory: urllib.parse* accepts bracketed non-IP hosts (CVE-2024-11168)
- CVE-2024-11168 at NVD
- RFC 3986: Uniform Resource Identifier (URI): Generic Syntax
- CPython Fix Commit
- Server-Side Request Forgery (SSRF) – PortSwigger
Summary:
If you’re using Python’s urllib.parse URL functions, upgrade now. Don’t assume Python will parse URLs strictly; always review how brackets, IPs, and hostnames are handled, especially if your system talks to internal services. Even common libraries can have surprising quirks—CVE-2024-11168 is a clear reminder of that.
Timeline
Published on: 11/12/2024 22:15:14 UTC
Last modified on: 01/06/2025 18:15:17 UTC