PyMySQL, a popular pure-Python MySQL client, is widely used in web apps and scripts. In mid-2024, a fresh vulnerability—CVE-2024-36039—drew attention for enabling SQL injection attacks if the library is used with untrusted JSON data. Although PyMySQL offers helper functions to escape values, it does _not_ properly handle dict keys when those come from untrusted sources. Here’s what you need to know, with simple code and explanations.
What’s the Actual Problem?
If you take untrusted data (like JSON from a web request), put its keys and values into SQL queries using PyMySQL’s escape_dict, you assume both are safely escaped. But only the values get sanitized; keys are inserted directly.
When you unpack a dictionary to create dynamic SQL queries—like in a mass insert or an update—the column names (the keys) are not escaped at all. If an attacker can control the keys, they can break out of the query structure and inject malicious SQL.
Realistic Example Code
Imagine a Flask web API endpoint that lets clients update their database rows by POSTing JSON. Here’s a simplified snippet:
import pymysql
from flask import request
connection = pymysql.connect(...)
@app.route('/update', methods=['POST'])
def update():
data = request.get_json() # Untrusted JSON input from the user
user_id = data.get('id')
fields = data.get('fields', {}) # Untrusted dict
# NOTE: Using escape_dict for the field values
query = "UPDATE users SET "
query += ', '.join([f"{k}=%({k})s" for k in fields])
query += " WHERE id=%s"
cursor = connection.cursor()
cursor.execute(query, {**fields, "id": user_id}) # Only field values are escaped!
connection.commit()
return "OK"
Let’s say the attacker sends this JSON
{
"id": 123,
"fields": {
"name": "Alice",
"salary = WHERE id>; --": "XXX"
}
}
The constructed SQL query will look like this
UPDATE users SET name=%(name)s, salary = WHERE id>; --=%(salary = WHERE id>; --)s WHERE id=%s
But MySQL will parse it as
UPDATE users SET name='Alice', salary = WHERE id>; -- ='XXX' WHERE id=123
The attacker’s injected key (salary = WHERE id>; --) breaks the query and can modify more rows than intended, or even drop tables or leak data!
Why Did This Happen?
PyMySQL’s API only escapes values with escape_dict({}). The column names (keys of the dict) are used as-is, giving attackers a way to inject SQL when dict keys depend on user input.
Most snippet guides—especially in Python—show code just like above for dynamic queries, assuming JSON dict keys are innocent. But with APIs that allow arbitrary keys (or if you map JSON keys to columns), this bug gets dangerous.
Not validating or restricting which columns can be updated?
If any of the above are true, you are vulnerable. This is a very common bug in Python/Flask code.
1. Never Allow Untrusted Dict Keys in SQL
Whitelist fields you expect, _never_ build query parts from untrusted keys.
ALLOWED_FIELDS = {'name', 'salary', 'email'}
safe_fields = {k: v for k, v in fields.items() if k in ALLOWED_FIELDS}
query = "UPDATE users SET " + ', '.join([f"{k}=%({k})s" for k in safe_fields])
2. Upgrade PyMySQL (if applicable)
Check the PyMySQL GitHub or PyPI page for updates and patches. As of now, the proper fix is to sanitize column names elsewhere in your code.
You can use MySQL’s backticks
query = "UPDATE users SET " + ', '.join([f"{k}=%({k})s" for k in safe_fields])
But do not do this with untrusted keys unless you validated them, or use a strict whitelist.
References
- CVE-2024-36039 at NVD
- PyMySQL GitHub issue
- PyMySQL Docs: API
TL;DR
If you build SQL queries with dict keys from untrusted sources in PyMySQL, attackers can run arbitrary queries on your MySQL database.
Fix your code to strictly control the keys you use in SQL, upgrade dependencies frequently, and review your use of query helpers.
PyMySQL is popular for its ease, but dynamic SQL with bad input handling can be a nightmare. Smarter, safer code is the only defense. Stay safe!
Timeline
Published on: 05/21/2024 16:15:26 UTC
Last modified on: 11/21/2024 16:15:24 UTC