CVE-2021-47449 addresses a subtle but serious locking bug in the Linux kernel's ice driver for Intel Ethernet controllers. While cleaning up transmit (Tx) timestamp tracking resources, the driver held a spinlock across a call that can sleep. The result is a kernel warning and a potential system freeze (deadlock), an issue mainly relevant to administrators running affected kernels on Intel network adapters.
In this post, let's break down what caused the bug, how it was fixed, what the code looked like, and how to tell whether you're affected.
What is ICE and Tx Timestamp Tracking?
The ice module drives many modern Intel network cards. Adapters that support hardware timestamping must keep track of which outgoing packets are still awaiting a timestamp from the hardware. When the driver or device is being removed, any packets left in that tracker have to be cleaned up properly, and that is where this bug appears.
How did CVE-2021-47449 Happen?
A particular function in the ICE driver is responsible for “flushing” the transmit timestamp tracker. The goal is to safely release all the leftover SKBs (socket buffers, representing packets) and prep the device for removal.
A commit (4ddd5c33c3e) added a spinlock around this flush routine to serialize it against other users of the tracker. The critical mistake: it held that spinlock across a call to ice_clear_phy_tstamp, which goes on to take a mutex, a type of lock that can sleep.
In Linux, sleeping locks (mutexes) may only be taken in a context that is allowed to sleep. Code holding a spinlock is in atomic context and must not sleep, so taking a mutex there produces warnings like the following on kernels built with atomic-sleep debugging (CONFIG_DEBUG_ATOMIC_SLEEP):
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:573
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 310, name: rmmod
This is a classic mistake. If the mutex is contended and the task really does sleep, it goes to sleep while still holding the spinlock. Anything that then spins on that lock, possibly on the same CPU, can keep the sleeper from ever running again to release it, and the system deadlocks.
Users reported seeing stack traces like this in their kernel logs:
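To make the rule concrete, here is a minimal userspace model of the check behind that warning. It is only a sketch: `atomic_depth`, `model_spin_lock`, `model_spin_unlock`, `model_might_sleep`, and `model_clear_phy` are hypothetical names invented for illustration, not kernel APIs, and the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's atomic-context tracking:
 * a value greater than zero means "a spinlock is held, do not sleep". */
static int atomic_depth;

/* Model of taking a spinlock: the holder enters atomic context. */
static void model_spin_lock(void)
{
    atomic_depth++;
}

/* Model of releasing a spinlock: the holder leaves atomic context. */
static void model_spin_unlock(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

/* Model of the debug check a sleeping primitive runs on entry.
 * Returns false (and logs) when called from atomic context. */
static bool model_might_sleep(const char *caller)
{
    if (atomic_depth > 0) {
        fprintf(stderr,
                "BUG: sleeping function called from invalid context (%s)\n",
                caller);
        return false;
    }
    return true;
}

/* Model of a helper that takes a mutex, so it may sleep. */
static bool model_clear_phy(void)
{
    return model_might_sleep("model_clear_phy");
}
```

Calling `model_clear_phy()` between `model_spin_lock()` and `model_spin_unlock()` trips the check, which is the same shape as the bug in the driver. Calling it before or after does not.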
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:573
[...]
CPU: 52 PID: 310 Comm: rmmod Tainted: G W OE 5.15.0-rc4+
[...]
ice_clear_phy_tstamp+0x2c/0x110 [ice]
ice_ptp_release+0x408/0x910 [ice]
ice_remove+0x560/0x6a [ice]
Here’s a distilled version of the broken section:
spin_lock_bh(&pf->ptp.track_lock); // Takes a spinlock
ice_clear_phy_tstamp(pf); // Calls a function that can sleep!
...
spin_unlock_bh(&pf->ptp.track_lock);
Inside ice_clear_phy_tstamp, the code tries to take a mutex:
mutex_lock(&pf->phy_mutex);
/* ... communicate with device firmware ... */
mutex_unlock(&pf->phy_mutex);
This is not allowed. You can’t hold a spinlock when taking a mutex, because the mutex may sleep.
The fix is simple, but crucial.
Only hold the spinlock while examining or modifying shared data. Release it before calling any sleeping functions. The corrected code looks like:
spin_lock_bh(&pf->ptp.track_lock);
/* manipulate shared data here */
spin_unlock_bh(&pf->ptp.track_lock);
ice_clear_phy_tstamp(pf); // this is now safe: no spinlock held here!
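The same ordering can be sketched as a small, single-threaded userspace model. Everything here is an assumption made for illustration: `struct tx_tracker`, `track_lock`, `track_unlock`, `clear_phy_slot`, and `flush_tracker` are hypothetical names, and the per-slot loop is one plausible way to arrange the work, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TRACK_LEN 4

/* Hypothetical, simplified tracker: one pending-packet slot per index. */
struct tx_tracker {
    int  atomic_depth;       /* > 0 while the modelled spinlock is held */
    bool in_use[TRACK_LEN];  /* slot still awaits a hardware timestamp */
    int  phy_clears;         /* number of PHY clears issued */
    int  clears_in_atomic;   /* PHY clears issued with the lock held (the bug) */
};

/* Model of taking the tracker spinlock. */
static void track_lock(struct tx_tracker *t)
{
    t->atomic_depth++;
}

/* Model of releasing the tracker spinlock. */
static void track_unlock(struct tx_tracker *t)
{
    assert(t->atomic_depth > 0);
    t->atomic_depth--;
}

/* Stand-in for a helper that takes a mutex and may therefore sleep.
 * It records whether it was called while the spinlock was held. */
static void clear_phy_slot(struct tx_tracker *t, size_t idx)
{
    (void)idx;
    if (t->atomic_depth > 0)
        t->clears_in_atomic++;
    t->phy_clears++;
}

/* Flush in the fixed order: touch shared state under the lock,
 * then do the potentially sleeping work with no lock held. */
static void flush_tracker(struct tx_tracker *t)
{
    for (size_t idx = 0; idx < TRACK_LEN; idx++) {
        track_lock(t);
        t->in_use[idx] = false;  /* release the slot */
        track_unlock(t);

        clear_phy_slot(t, idx);  /* no spinlock held here */
    }
}
```

After `flush_tracker()` runs, every slot is released and `clears_in_atomic` stays at zero. Moving the `clear_phy_slot()` call above `track_unlock()` reproduces the original mistake in this model.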
The patch landed in the kernel:
> ice: fix locking for Tx timestamp tracking flush
> Move ice_clear_phy_tstamp() call out of spinlock section.
Can This be Exploited?
This issue cannot be exploited directly for privilege escalation or code execution.
However, it can lead to a system hang or improper device removal, resulting in a denial of service if, for example, an attacker can trigger repeated driver unloads. Unloading a module normally requires root privileges, which limits who can trigger it.
If your systems depend on high-availability networking, that is still a non-trivial risk.
Are You Affected?
You may be affected if both of the following apply:
- You use Intel network cards with the ice module loaded
- You remove/unload/reload the driver (e.g., with rmmod ice) or perform device hot-unplug
If so, check your kernel version with uname -r.
If that kernel does not carry the fix, check dmesg or your logs for warnings like those shown earlier.
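If you want to automate that log check, the matching logic is simple. This is a rough sketch under stated assumptions: `log_shows_ice_atomic_sleep` and `WARNING_TEXT` are hypothetical names, the caller is assumed to have already read the log lines into an array (for example from saved dmesg output), and a match is only a hint that deserves a closer look, not proof.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Text of the warning quoted earlier in this post. */
#define WARNING_TEXT "BUG: sleeping function called from invalid context"

/* Returns true if the log lines contain the atomic-sleep warning followed,
 * somewhere later, by a stack frame in ice_clear_phy_tstamp. */
static bool log_shows_ice_atomic_sleep(const char *const lines[], size_t n)
{
    bool seen_warning = false;

    assert(lines != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (strstr(lines[i], WARNING_TEXT) != NULL)
            seen_warning = true;
        else if (seen_warning &&
                 strstr(lines[i], "ice_clear_phy_tstamp") != NULL)
            return true;
    }
    return false;
}
```

Requiring the warning to come before the ice frame keeps unrelated atomic-sleep warnings from other drivers from counting as a hit.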
How to Patch
- Upgrade your kernel to v5.15.1 or newer (or any version carrying the patch backported)
- On distro kernels, install your vendor's updated kernel package and reboot.
No configuration change is needed; the fix is within the driver itself.
Final Thoughts
CVE-2021-47449 is a perfect example of how tricky locking errors in kernel drivers can cause real headaches, sometimes as severe as a full system lock-up. While not “remotely exploitable,” the fix is important for any production system using Intel Ethernet with hardware timestamping.
Remember: Always be careful mixing locking primitives, and keep kernel modules up to date — especially if you handle hot-plug or device removal!
Timeline
Published on: 05/22/2024 07:15:10 UTC
Last modified on: 08/08/2024 15:35:01 UTC