A recent Linux kernel update resolves a vulnerability that could corrupt the kernel stack. This post describes the vulnerability (CVE-2024-26586) and the fix that addresses it.

The Vulnerability

The vulnerability resides in mlxsw, the Linux kernel driver for Mellanox Ethernet switches. Specifically, it affects the mlxsw_sp_acl_tcam_group_update function in the spectrum_acl_tcam part of the driver, which configures ACL groups through the PAGT register. On Spectrum-2 and newer ASICs, the firmware reports that the maximum number of ACLs in a group is more than 16, but the layout of the PAGT register was not updated to account for this. As a result, the stack can be corrupted in the case where a group requires more than 16 ACLs.
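
To make the mismatch concrete, here is a minimal userspace sketch, not the driver's code. The names PAGT_MAX_ACLS, struct pagt_payload, pagt_set_acl and pagt_count_overflowing are hypothetical; only the limit of 16 comes from the description above. The bounds-checked setter shows which indices an unchecked write would place past the end of a 16-slot buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the number of ACL slots the PAGT layout holds. */
#define PAGT_MAX_ACLS 16

/* Hypothetical payload: one slot per ACL in the group. */
struct pagt_payload {
    uint16_t acl_id[PAGT_MAX_ACLS];
};

/*
 * Bounds-checked setter. Returns 0 on success, -1 if the index does not fit.
 * Without this check, an index >= PAGT_MAX_ACLS would write past the end of
 * the payload, which in this sketch lives on the caller's stack.
 */
static int pagt_set_acl(struct pagt_payload *pl, size_t index, uint16_t acl_id)
{
    if (index >= PAGT_MAX_ACLS)
        return -1;
    pl->acl_id[index] = acl_id;
    return 0;
}

/* Count how many ACLs of a group of the given size would not fit. */
static size_t pagt_count_overflowing(size_t group_size)
{
    struct pagt_payload pl = {0};
    size_t rejected = 0;

    for (size_t i = 0; i < group_size; i++) {
        if (pagt_set_acl(&pl, i, (uint16_t)i) != 0)
            rejected++;
    }
    return rejected;
}
```

In this sketch, a group of 16 ACLs fits exactly, while a group of 20 would have four entries land outside the buffer if the check were missing.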

When the vulnerability is triggered, the system may crash with a kernel panic message such as:

Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: mlxsw_sp_acl_tcam_group_update+0x116/0x120

The vulnerability is considered low-risk because it can only be triggered when a single group requires more than 16 ACLs, which is rare in practice.

The Fix

The fix limits the maximum ACL group size to the smaller of two values: what the firmware reports and the maximum number of ACLs that fit in the PAGT register.

The following simplified snippet illustrates the clamping idea in the spectrum_acl_tcam part of the driver. It is not the verbatim upstream patch, and the constant names should be treated as illustrative; consult the commit linked under References for the exact change.

static int mlxsw_sp_acl_tcam_group_update(struct mlxsw_sp *mlxsw_sp,
                          struct mlxsw_sp_acl_tcam_group *group)
{
    /* ... */
    u16 max_regions = min_t(u16, MLXSW_SP_ACL_TCAM_MAX_AG_GR,
                MLXSW_SP_ACL_TCAM_MAX_AG_GR_FW);
    /* ... */
}
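
The clamping logic can also be sketched as a small self-contained userspace example. This is an illustration under stated assumptions rather than the upstream implementation: struct acl_group, acl_group_init, acl_group_attach_region and PAGT_MAX_ACLS are hypothetical names, and the capacity of 16 is taken from the description of the PAGT register above.

```c
/* Assumed number of ACLs that fit in the PAGT register layout. */
#define PAGT_MAX_ACLS 16

/* Hypothetical bookkeeping for one ACL group. */
struct acl_group {
    unsigned int max_size;      /* effective limit after clamping */
    unsigned int region_count;  /* regions (ACLs) currently attached */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Clamp the firmware-reported limit to what the register can describe. */
static void acl_group_init(struct acl_group *g, unsigned int fw_reported_max)
{
    g->max_size = min_uint(fw_reported_max, PAGT_MAX_ACLS);
    g->region_count = 0;
}

/*
 * Attach one more region to the group.
 * Returns 0 on success, -1 when the group is already at its clamped limit,
 * so the register payload is never asked to hold more entries than it has.
 */
static int acl_group_attach_region(struct acl_group *g)
{
    if (g->region_count >= g->max_size)
        return -1;
    g->region_count++;
    return 0;
}
```

With a firmware-reported limit of 32, the group in this sketch accepts 16 regions and rejects the 17th with an error instead of overrunning the buffer. A firmware-reported limit below 16 is kept unchanged.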

A test case was also added to verify that the system no longer crashes under the conditions that previously caused the stack corruption.

References

- Original commit with the fix for this issue in the Linux kernel: https://github.com/torvalds/linux/commit/a9becf1e9e07ce2d61b01dc72d146cb3e8d679d6

- In-depth discussion of the vulnerability and fix: https://lore.kernel.org/netdev/20220118165634.374515-1-idosch@idis.win.tue.nl/T/#u

In conclusion, CVE-2024-26586 has been resolved by limiting the maximum ACL group size, which prevents the stack corruption and the resulting system crashes. If you run Mellanox Ethernet switches under Linux, update to a kernel release that includes the fix.

Timeline

Published on: 02/22/2024 17:15:08 UTC
Last modified on: 03/18/2024 18:12:44 UTC