CVE-2022-41137 - Practical Guide to Exploiting Apache Hive Metastore RCE via Unsafe Deserialization

In 2022, a severe vulnerability was discovered in Apache Hive Metastore (HMS): CVE-2022-41137. It can let an attacker who is able to connect to the Metastore execute code remotely on the server by exploiting unsafe deserialization. If you operate Hive Metastore in your stack, you’ll want to understand what went wrong, how it can be attacked, and what you should do next.

What Causes CVE-2022-41137?

It all starts with Hive’s Metastore server, the service responsible for storing Hive’s metadata. Internally, HMS calls a method named SerializationUtilities#deserializeObjectWithTypeInformation when filtering and fetching partitions. The problem is that this method does not restrict which types of objects it deserializes. If someone feeds it untrusted serialized data, it will reconstruct whatever classes the data names, potentially running malicious code in the process.

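The snippet below is a simplified illustration of the unsafe pattern, not the verbatim contents of SerializationUtilities.java. The Hive implementation is built on the Kryo serialization library rather than ObjectInputStream, but the flaw is the same: the type of the incoming object is never restricted.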

public static Object deserializeObjectWithTypeInformation(byte[] objectData) 
        throws IOException, ClassNotFoundException {
    ByteArrayInputStream bais = new ByteArrayInputStream(objectData);
    ObjectInputStream ois = new ObjectInputStream(bais);
    // Unsafe: Reads any serialized object, no checks!
    return ois.readObject();
}

Any authenticated user who talks to HMS and manages to pass their own crafted serialized object to this function can trigger arbitrary code execution on the server.
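To see why unchecked deserialization is dangerous, here is a minimal, harmless sketch using Java's built-in serialization, as in the illustration above. The class Noisy is hypothetical and has nothing to do with Hive; it only shows that the JVM runs class-defined code (the private readObject hook) while rebuilding an object, before the caller ever sees the result. Gadget chains abuse hooks like this in classes that already exist on the server's classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class DeserializationHookDemo {

    // Hypothetical class for illustration only; not part of Hive.
    public static class Noisy implements Serializable {
        private static final long serialVersionUID = 1L;
        public static int hookCalls = 0; // counts how often the hook ran

        // The JVM invokes this private method while deserializing a Noisy.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            hookCalls++; // harmless stand-in for logic the caller never asked for
        }
    }

    public static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    // Same shape as the unsafe pattern: no check on what is being read.
    public static Object deserialize(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serialize(new Noisy());
        deserialize(bytes);
        System.out.println("hook calls: " + Noisy.hookCalls); // prints "hook calls: 1"
    }
}
```

The caller of deserialize never invokes anything on Noisy, yet its code runs. That is the property an attacker relies on.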

What Are the Attack Requirements?

- Authenticated User: To reach the vulnerable code, you need to have access credentials to HMS (usually via Kerberos, LDAP, etc.).
- Ability to Send Requests: The attacker must form requests to HMS endpoints that leverage deserialization.
- API Exposure: If any API handler or custom extension uses this method without validating input beforehand, it’s susceptible.

Note: The attacker must first establish a connection to the Metastore. In deployments that enforce authentication, this means the bug cannot be exploited anonymously; valid credentials are required.

Exploiting the Vulnerability: Step By Step

To understand practical exploitation, let’s break it down with a simple example—assuming you already have valid user credentials to HMS.

1. Crafting a Malicious Serialized Payload

First, the attacker builds a Java “gadget chain” payload with a tool such as ysoserial. A gadget chain only works if the classes it relies on are present on the server's classpath. For example, to run touch /tmp/pwned on the server:

java -jar ysoserial.jar CommonsCollections6 "touch /tmp/pwned" > exploit.ser

This generates a malicious object using known gadget classes.

2. Sending the Payload to HMS

Next, the attacker sends the payload to HMS by calling an API endpoint or internal method; the exact route varies by deployment:

- If there’s a method like filterAndFetchPartitions or similar that accepts arbitrary objects, pass the serialized byte array as input.
- For Hive deployments using Thrift, an attacker may use the metastore Thrift API to transmit the payload.

Illustrative pseudo-code for handing the bytes to such a method (SomeHMSApi is a placeholder, not a real Hive class):

byte[] payload = Files.readAllBytes(Paths.get("exploit.ser"));
SomeHMSApi.filterAndFetchPartitions(payload);

The precise route depends on which endpoints in your setup accept serialized data without checks.

3. Code Execution

As soon as the server’s JVM tries to deserialize the object, it executes the payload, launching the desired command.

References

- Apache Hive Security Advisory for CVE-2022-41137
- NVD Database: CVE-2022-41137
- Hive PR Fixing the Vulnerability
- ysoserial Tool for Gadget Payloads

Mitigation

Never Trust Serialized Input:

Always validate and sanitize what gets deserialized, and do not expose raw Java object deserialization as a public or even authenticated API.
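As one way to validate what gets deserialized, the sketch below uses the JDK's ObjectInputFilter (available since Java 9) to enforce an allow-list on a native Java serialization stream. It is not Hive's actual fix, which is not reproduced here; the class name FilteredDeserializer and the allow-list contents are assumptions chosen for the demo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Date;

public class FilteredDeserializer {

    // Allow only Integer (and its serializable superclass Number); "!*" rejects
    // every other class. The list is an assumption for this demo.
    private static final ObjectInputFilter ALLOW_LIST =
            ObjectInputFilter.Config.createFilter(
                    "java.lang.Integer;java.lang.Number;!*");

    public static Object deserialize(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(data))) {
            // Must be installed before the first object is read.
            ois.setObjectInputFilter(ALLOW_LIST);
            return ois.readObject();
        }
    }

    public static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // An allowed type passes through.
        System.out.println(deserialize(serialize(42)));

        // A type outside the allow-list is refused before it is instantiated.
        try {
            deserialize(serialize(new Date()));
        } catch (InvalidClassException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

An allow-list is generally preferable to a deny-list here, because new gadget classes keep being discovered and a deny-list can only cover the ones already known.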

Restrict Metastore Access:

Use firewalls, network segmentation, and least-privilege access controls so only trusted nodes or users can talk to HMS.

Monitor Logs:

Watch for suspicious API usage, especially requests with unexpected content lengths or unrecognized class types.
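One simple heuristic for flagging suspicious inputs is to check whether a byte array begins with the header of a native Java serialization stream. The sketch below does that using the constants from java.io.ObjectStreamConstants. The class and method names are hypothetical, and the check only recognizes native Java serialization; it would not detect other formats such as Kryo, so treat it as one signal among several rather than a complete detector.

```java
import java.io.ObjectStreamConstants;

public class SerializedStreamSniffer {

    // Returns true if the data starts with the Java serialization stream
    // header: magic number 0xACED followed by stream version 5.
    public static boolean looksLikeJavaSerialization(byte[] data) {
        if (data == null || data.length < 4) {
            return false;
        }
        int magic = ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
        int version = ((data[2] & 0xFF) << 8) | (data[3] & 0xFF);
        return magic == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)
                && version == ObjectStreamConstants.STREAM_VERSION;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05};
        System.out.println(looksLikeJavaSerialization(header));             // true
        System.out.println(looksLikeJavaSerialization("hello".getBytes())); // false
    }
}
```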

Summary

CVE-2022-41137 is a reminder: never blindly deserialize data in Java, especially data from users, even “trusted” ones. Although only authenticated users can exploit the bug, that still covers a lot of potential attackers in large clusters or shared platforms. Security patches and careful API hygiene are your best defenses.

Stay safe out there, and always keep your Hadoop and Hive components updated.


Timeline

Published on: 12/05/2024 10:15:04 UTC
Last modified on: 12/05/2024 17:15:07 UTC