CVE-2023-32007 - Apache Spark UI Impersonation Vulnerability Enables Arbitrary Command Execution

*Last updated: June 2024*

Apache Spark is a popular, powerful big data processing engine used by thousands of companies. Like many other modern software platforms, Spark includes a web-based interface (the Spark UI) for monitoring and managing jobs. Securing this interface properly is critical, especially as Spark runs on many shared production clusters.

In this post, we break down a major security issue affecting the Spark UI: CVE-2023-32007. It matters because, if left unpatched, it can let an attacker run arbitrary shell commands with the privileges of your Spark service account. We’ll use plain English and concrete examples to walk through how it works, with references to the original sources.

What is CVE-2023-32007?

CVE-2023-32007 is a security vulnerability in Apache Spark, mainly affecting versions before 3.4.0 that are now unsupported. The core problem lies in how Spark’s UI authentication filter (the HttpSecurityFilter) handles user impersonation when Access Control Lists (ACLs) are enabled.

Affected Feature: Spark Web UI
Core Problem: Improper handling of user identity allows attackers to impersonate other users
Impact: Attacker can execute arbitrary shell commands as the user running Spark
CVSS: Critical (details further below)

Original References

- CVE-2023-32007 at NVD
- Apache Spark Security Advisory: CVE-2022-33891 *(original but slightly incorrect advisory)*
- Apache Spark Downloads and Release Notes

It starts when a system administrator enables ACLs with this option in spark-defaults.conf:

spark.acls.enable=true

With ACLs on, Spark is supposed to check if a logged-in user actually has access to a given application or action via the Spark UI.
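To find out whether a given cluster has this option turned on, the configuration file can be checked programmatically. The sketch below is a minimal illustration: the function name `acls_enabled` is hypothetical, and it assumes the file uses the usual properties-style layout in which a key and value are separated by whitespace, `=`, or `:`.

```python
import re


def acls_enabled(conf_text: str) -> bool:
    """Return True if spark.acls.enable is set to true in the given config text."""
    enabled = False
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Accept "key value", "key=value", and "key: value" forms.
        parts = re.split(r"\s*[=:]\s*|\s+", line, maxsplit=1)
        if len(parts) == 2 and parts[0] == "spark.acls.enable":
            # A later occurrence overrides an earlier one.
            enabled = parts[1].strip().lower() == "true"
    return enabled


if __name__ == "__main__":
    sample = "spark.master yarn\nspark.acls.enable=true\n"
    print(acls_enabled(sample))
```

This only inspects text passed to it; settings supplied on the command line or set in application code would need to be checked separately.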

The Flaw in Authentication

The vulnerability happens because the HttpSecurityFilter did not properly verify the authenticity of the username being submitted. A user could just submit any username as part of a crafted request, and Spark would trust it. Spark would then run permission checks as if the request came from that user—even if the attacker is not actually authenticated as that user!

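Let’s sketch the risky logic in Python-style pseudocode (a simplification, not Spark’s actual source):

def http_filter(request):
    if spark_acls_enabled:
        # Vulnerable: the identity is taken from attacker-controlled input
        username = request.get("user")
        # The permission check itself can end up building a shell command
        if user_has_permissions(username):
            do_action()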

In reality, Spark trusts the user parameter it gets (from HTTP requests or cookies), but the attacker controls this parameter.
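For contrast, here is a minimal sketch of a safer pattern, in which the identity used for ACL checks comes from the authentication layer and a request-supplied name is honored only for principals explicitly allowed to impersonate. The function `resolve_effective_user`, the `allowed_impersonators` set, and the `user` parameter name are all hypothetical and used only for illustration; this is not how Spark's patched filter is implemented.

```python
from typing import Mapping, Optional, Set


def resolve_effective_user(
    authenticated_user: Optional[str],
    query_params: Mapping[str, str],
    allowed_impersonators: Set[str],
) -> str:
    """Decide which identity the ACL check should run as."""
    if authenticated_user is None:
        # Never fall back to a name supplied in the request.
        raise PermissionError("request is not authenticated")

    requested = query_params.get("user")
    if requested is None or requested == authenticated_user:
        return authenticated_user

    # Impersonation is honored only for explicitly trusted principals.
    if authenticated_user not in allowed_impersonators:
        raise PermissionError(
            f"{authenticated_user!r} may not impersonate another user"
        )
    return requested


if __name__ == "__main__":
    print(resolve_effective_user("alice", {}, {"gateway"}))
    print(resolve_effective_user("gateway", {"user": "bob"}, {"gateway"}))
```

The key design choice is that the request parameter can never create an identity on its own; it can only narrow or redirect an identity that was already verified.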

The Dangerous Chain: Impersonation → Shell Commands

In certain UI routes, Spark eventually passes the requested username into a permission check and, in the process, builds a shell command that includes the user-controlled input, for example when checking filesystem permissions or launching completion scripts.

If the attacker slips in user=; rm -rf /; # (or similar tricks), the backend will end up building, then running, a command like:

sudo -u ; rm -rf /; # -c <spark-script>

…which, as you can imagine, is disastrous. The attacker can execute anything as the Spark user.
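The general pattern behind this kind of injection can be illustrated without running anything. In the sketch below, `unsafe_command` pastes the username into a shell string, while `safe_argv` validates the name and returns an argument list that could be passed to `subprocess.run` without a shell. The `id -Gn` group lookup, the function names, and the username pattern are illustrative assumptions, not code taken from Spark.

```python
import re
from typing import List

# Conservative allowlist for Unix-style account names (an assumption for this sketch).
USERNAME_PATTERN = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def unsafe_command(username: str) -> str:
    """Vulnerable pattern: user input is concatenated into a shell string."""
    return f"id -Gn {username}"


def safe_argv(username: str) -> List[str]:
    """Safer pattern: validate first, then build an argument list (no shell)."""
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError(f"rejected suspicious username: {username!r}")
    return ["id", "-Gn", username]


if __name__ == "__main__":
    payload = "; touch /tmp/pwned; #"
    # A shell would treat everything after the first ';' as a second command.
    print(unsafe_command(payload))
    print(safe_argv("alice"))
```

With an argument list, the username is passed as a single argument and is never interpreted by a shell, so metacharacters such as `;` and `#` lose their special meaning even before the validation step is considered.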

Simple Exploit Example

Suppose you have an old, vulnerable Spark version (prior to 3.4.0) with spark.acls.enable=true, an exposed UI, and no extra access controls or proxies.

1. Craft an HTTP request to the Spark UI with a user parameter.

2. Set the parameter to something like attacker; touch /tmp/pwned; #.

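Sample curl command (the payload is URL-encoded so that the space, the semicolons, and the # reach the server intact instead of being dropped or rejected by the client):

curl -k -G "https://spark.example.com:4040/" --data-urlencode "user=attacker;touch /tmp/pwned;#"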

If the Spark process has enough privileges, you’d see /tmp/pwned created—even though only an authenticated admin should have been able to do that!
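On the defensive side, requests like this tend to stand out in proxy or access logs. The sketch below flags request URLs whose identity parameter contains shell metacharacters. The function name `suspicious_user_values`, the metacharacter set, and the default parameter name `user` (taken from the example in this post) are assumptions for illustration; a real deployment would adapt them to its own log format.

```python
from typing import List
from urllib.parse import parse_qs, urlsplit

# Characters that commonly signal an attempt to break out into a shell.
SHELL_METACHARACTERS = set(";|&`$<>\n")


def suspicious_user_values(url: str, param: str = "user") -> List[str]:
    """Return decoded values of `param` that contain shell metacharacters."""
    query = urlsplit(url).query
    values = parse_qs(query, keep_blank_values=True).get(param, [])
    return [v for v in values if any(ch in SHELL_METACHARACTERS for ch in v)]


if __name__ == "__main__":
    logged = "https://spark.example.com/?user=attacker%3Btouch%20/tmp/pwned%3B%23"
    print(suspicious_user_values(logged))
```

A check like this is a detection aid only; it does not replace upgrading, since payloads can be obfuscated in ways a simple character filter will miss.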

Why Was This Missed Before?

This same vulnerability was previously disclosed as CVE-2022-33891. However, the original advisory incorrectly claimed that Spark 3.1.3 wasn’t affected; it turns out it *was*, which is why CVE-2023-32007 was issued. As of now, all *vulnerable* versions (including 3.1.x) are no longer supported.

If you’re running a Spark version prior to 3.4.0:

- Upgrade to 3.4.0 or later immediately (Upgrade guide).

- Run Spark as a non-root user with minimal shell access.
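A version check like the one above is easy to automate across a fleet. The sketch below compares a reported Spark version string against a minimum, assumed here to be 3.4.0 as recommended in this post. The helper names `parse_version` and `needs_upgrade` are hypothetical, and the parser handles only simple dotted versions with an optional suffix such as `-SNAPSHOT`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.1.3' or '3.3.2-SNAPSHOT' into a comparable tuple of integers."""
    core = version.strip().split("-", 1)[0]  # drop any suffix after a dash
    parts = [int(p) for p in core.split(".")]
    # Pad to three components so that '3.4' compares equal to '3.4.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def needs_upgrade(version: str, minimum: str = "3.4.0") -> bool:
    """Return True if `version` is older than `minimum`."""
    return parse_version(version) < parse_version(minimum)


if __name__ == "__main__":
    for v in ("3.1.3", "3.3.2-SNAPSHOT", "3.4.0", "3.5.1"):
        print(v, "needs upgrade" if needs_upgrade(v) else "ok")
```

The version string itself can be obtained, for instance, from the output of `spark-submit --version` or from the `version` attribute of a running `SparkSession`.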

Remember, unsupported versions are dangerous because you won’t get future fixes.

Final Thoughts: Simple, Yet Devastating

CVE-2023-32007 is a perfect example of how simple authentication missteps—especially in web interfaces for big data products—can quickly turn into catastrophic breaches. If you’re running Spark in production, always prefer secure, up-to-date builds, and never expose admin interfaces directly to untrusted networks.

Key References

- National Vulnerability Database: CVE-2023-32007
- Apache Spark Upgrade Instructions
- Full Spark Release Notes

Stay vigilant and keep your data secured! 🚨

*This post was written exclusively for learners and practitioners concerned with Spark security.*

Timeline

Published on: 05/02/2023 09:15:00 UTC
Last modified on: 05/10/2023 20:16:00 UTC