[PATCH] net/mlx4: fix MAC table total count corruption in __mlx4_unregister_mac()

Kery Qi posted 1 patch 2 weeks, 1 day ago
drivers/net/ethernet/mellanox/mlx4/port.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Posted by Kery Qi 2 weeks, 1 day ago
In __mlx4_unregister_mac(), when operating in mf_bonded mode
(SR-IOV with bonding), it appears that the code might be incorrectly
decrementing table->total instead of dup_table->total when cleaning
up the duplicate table entry.

If this is the case, it would cause the primary table's total counter
to be decremented twice (once for itself and once when it should
decrement the duplicate table), leading to counter corruption.
Over time, table->total could become negative, which would
break the "table->total == table->max" fullness check in
__mlx4_register_mac().

The registration path correctly increments both counters:
  ++table->total;
  if (dup) {
      ...
      ++dup_table->total;
  }

However, the unregistration path seems to have a typo:
  --table->total;
  if (dup) {
      ...
      --table->total; // Should this be --dup_table->total?

Fixes: 5f61385d2ebc2 ("net/mlx4_core: Keep VLAN/MAC tables mirrored in multifunc HA mode")
Signed-off-by: Kery Qi <qikeyu2017@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx4/port.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/port.c b/drivers/net/ethernet/mellanox/mlx4/port.c
index e3d0b13c1610..6d0295c471da 100644
--- a/drivers/net/ethernet/mellanox/mlx4/port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/port.c
@@ -410,7 +410,7 @@ void __mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, u64 mac)
 		if (mlx4_set_port_mac_table(dev, dup_port, dup_table->entries))
 			mlx4_warn(dev, "Fail to set mac in duplicate port %d during unregister\n", dup_port);
 
-		--table->total;
+		--dup_table->total;
 	}
 out:
 	if (dup) {
-- 
2.34.1
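
The counter drift described in the commit message can be reproduced in a small user-space model. This is only a sketch, not the driver code: struct mac_table_model, model_register(), model_unregister() and model_total_after_cycles() are hypothetical names, the tables are reduced to their total/max counters, and the counters are assumed to be signed ints (consistent with the "could become negative" claim above). The "fixed" flag selects between the pre-patch decrement and the patched one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the MAC table; only the counters are modeled. */
struct mac_table_model {
	int total;	/* assumed signed, so it can go below zero */
	int max;
};

/* Mirrors the registration path quoted above: both counters go up. */
static void model_register(struct mac_table_model *table,
			   struct mac_table_model *dup_table)
{
	assert(table != NULL);
	++table->total;
	if (dup_table)
		++dup_table->total;
}

/*
 * Mirrors the unregistration path. With fixed == 0 the second decrement
 * hits the primary table again (pre-patch); with fixed != 0 it hits the
 * duplicate table (patched).
 */
static void model_unregister(struct mac_table_model *table,
			     struct mac_table_model *dup_table, int fixed)
{
	assert(table != NULL);
	--table->total;
	if (dup_table) {
		if (fixed)
			--dup_table->total;
		else
			--table->total;
	}
}

/*
 * Run a number of register/unregister cycles starting from empty tables.
 * Returns the primary total and stores the duplicate total in *dup_total.
 */
static int model_total_after_cycles(int cycles, int fixed, int *dup_total)
{
	struct mac_table_model table = { .total = 0, .max = 128 };
	struct mac_table_model dup_table = { .total = 0, .max = 128 };
	int i;

	for (i = 0; i < cycles; i++) {
		model_register(&table, &dup_table);
		model_unregister(&table, &dup_table, fixed);
	}
	*dup_total = dup_table.total;
	return table.total;
}
```

In this model, each pre-patch cycle leaves the primary counter one lower and the duplicate counter one higher than before, so neither can be relied on for a "total == max" fullness check; with the patched decrement both return to zero. The max value of 128 is an arbitrary placeholder and plays no role in the arithmetic.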
Re: [PATCH] net/mlx4: fix MAC table total count corruption in __mlx4_unregister_mac()
Posted by Tariq Toukan 1 week, 4 days ago

On 22/01/2026 20:39, Kery Qi wrote:
> In __mlx4_unregister_mac(), when operating in mf_bonded mode
> (SR-IOV with bonding), it appears that the code might be incorrectly
> decrementing table->total instead of dup_table->total when cleaning
> up the duplicate table entry.
> 
> If this is the case, it would cause the primary table's total counter
> to be decremented twice (once for itself and once when it should
> decrement the duplicate table), leading to counter corruption.
> Over time, table->total could become negative, which would
> break the "table->total == table->max" fullness check in
> __mlx4_register_mac().
> 
> The registration path correctly increments both counters:
>    ++table->total;
>    if (dup) {
>        ...
>        ++dup_table->total;
>    }
> 
> However, the unregistration path seems to have a typo:
>    --table->total;
>    if (dup) {
>        ...
>        --table->total; // Should this be --dup_table->total?
> 
> Fixes: 5f61385d2ebc2 ("net/mlx4_core: Keep VLAN/MAC tables mirrored in multifunc HA mode")
> Signed-off-by: Kery Qi <qikeyu2017@gmail.com>
> ---

Hi Kery,

1. Commit message is phrased as an RFC, with questions and uncertainty.
Please re-phrase.
2. Do you hit an actual failure here? What are the steps? What error do 
you see?

Other than that, code LGTM.

>   drivers/net/ethernet/mellanox/mlx4/port.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/port.c b/drivers/net/ethernet/mellanox/mlx4/port.c
> index e3d0b13c1610..6d0295c471da 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/port.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/port.c
> @@ -410,7 +410,7 @@ void __mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, u64 mac)
>   		if (mlx4_set_port_mac_table(dev, dup_port, dup_table->entries))
>   			mlx4_warn(dev, "Fail to set mac in duplicate port %d during unregister\n", dup_port);
>   
> -		--table->total;
> +		--dup_table->total;
>   	}
>   out:
>   	if (dup) {
Re: [PATCH] net/mlx4: fix MAC table total count corruption in __mlx4_unregister_mac()
Posted by Jakub Kicinski 1 week, 4 days ago
On Tue, 27 Jan 2026 08:26:49 +0200 Tariq Toukan wrote:
> 1. Commit message is phrased as an RFC, with questions and uncertainty.
> Please re-phrase.
> 2. Do you hit an actual failure here? What are the steps? What error do 
> you see?

Alternatively, perhaps, seeing that Kery is sending patches to random
pieces of code - please add a paragraph explaining what kind of tool
you used to detect the issue, and stating that you haven't actually
hit it. Please note that disclosing that an issue was found
by a static analysis tool is _required_ in the kernel process.
Re: [PATCH] net/mlx4: fix MAC table total count corruption in __mlx4_unregister_mac()
Posted by Jakub Kicinski 1 week, 4 days ago
On Fri, 23 Jan 2026 02:39:07 +0800 Kery Qi wrote:
> In __mlx4_unregister_mac(), when operating in mf_bonded mode
> (SR-IOV with bonding), it appears that the code might be incorrectly
> decrementing table->total instead of dup_table->total when cleaning
> up the duplicate table entry.
> 
> If this is the case, it would cause the primary table's total counter
> to be decremented twice (once for itself and once when it should
> decrement the duplicate table), leading to counter corruption.
> Over time, table->total could become negative, which would
> break the "table->total == table->max" fullness check in
> __mlx4_register_mac().
> 
> The registration path correctly increments both counters:
>   ++table->total;
>   if (dup) {
>       ...
>       ++dup_table->total;
>   }
> 
> However, the unregistration path seems to have a typo:
>   --table->total;
>   if (dup) {
>       ...
>       --table->total; // Should this be --dup_table->total?

Looks legit, Tariq? Are you trying to find/dust off an mlx4 card? :)