net: dev_addr_lists: don't WARN on GFP_ATOMIC kmalloc failure in netif_rx_mode_run

[PATCH net-next] net: dev_addr_lists: don't WARN on GFP_ATOMIC kmalloc failure in netif_rx_mode_run

Posted by Zijing Yin 5 days, 17 hours ago

netif_rx_mode_run() fires netdev_WARN() when netif_addr_lists_snapshot()
returns non-zero. The only path to a non-zero return is -ENOMEM from
__hw_addr_create() failing its kmalloc(GFP_ATOMIC) -- syzkaller hits it
via failslab, and it can also be reached under real memory pressure.

This is the only allocator-failure site in this file that WARNs.
__hw_addr_create() itself, and every other caller of it in this file
(__hw_addr_add_ex(), dev_uc_add_excl(), dev_uc_add(), dev_mc_add(), ...)
just propagate the -ENOMEM silently. GFP_ATOMIC is a may-fail allocator;
callers are required to handle NULL, so a returned -ENOMEM here is an
expected runtime condition, not an invariant violation.

The miss is self-healing: any subsequent change to dev->uc / dev->mc
(every dev_{uc,mc}_{add,del}, IFF_PROMISC flip, etc.) calls
__dev_set_rx_mode() -> netif_rx_mode_queue(), which re-queues the device
and retries the sync. The only cost of a failed attempt is one stale
rx-mode window until the next update; nothing in the kernel relies on
this attempt succeeding.

Demote to net_err_ratelimited() so the condition stays observable in
dmesg without tripping panic_on_warn.

Reproducer (syzkaller .prog format with setup notes): 
https://pastebin.com/t7AQKx9v

Fixes: 3554b4345d85 ("net: introduce ndo_set_rx_mode_async and netdev_rx_mode_work")
Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
---
 net/core/dev_addr_lists.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
index d73fcb0c6..c6fdcac74 100644
--- a/net/core/dev_addr_lists.c
+++ b/net/core/dev_addr_lists.c
@@ -1275,7 +1275,8 @@ static void netif_rx_mode_run(struct net_device *dev)
 		err = netif_addr_lists_snapshot(dev, &uc_snap, &mc_snap,
 						&uc_ref, &mc_ref);
 		if (err) {
-			netdev_WARN(dev, "failed to sync uc/mc addresses\n");
+			net_err_ratelimited("%s: failed to sync uc/mc addresses\n",
+					    netdev_name(dev));
 			netif_addr_unlock_bh(dev);
 			return;
 		}
-- 
2.43.0
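
For readers following along, the error branch above can be modelled in
userspace C. This is only a sketch under stated assumptions: struct
ratelimit, ratelimit_ok(), snapshot_addrs() and rx_mode_run() are
hypothetical stand-ins, not kernel APIs, and the model returns an int
(the real netif_rx_mode_run() returns void) so the outcome can be checked.
It shows the shape of the change: an allocation failure becomes a
rate-limited log line plus an early return, rather than an assertion.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ratelimit state: at most `burst`
 * messages are allowed per window of `interval` ticks. */
struct ratelimit {
	long interval;
	int burst;
	long window_start;
	int printed;
	int suppressed;
};

/* Returns true when the caller may log. `now` is passed in explicitly
 * so the behaviour is deterministic. */
static bool ratelimit_ok(struct ratelimit *rl, long now)
{
	if (now - rl->window_start >= rl->interval) {
		/* A new window starts: reset the per-window counter. */
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->suppressed++;
	return false;
}

/* Hypothetical snapshot step; `fail` simulates an injected
 * allocation failure (as failslab would). */
static int snapshot_addrs(size_t n, bool fail, unsigned char **out)
{
	unsigned char *buf = fail ? NULL : malloc(n ? n : 1);

	if (!buf)
		return -ENOMEM;
	*out = buf;
	return 0;
}

/* Models the error branch: log (rate-limited), drop this attempt,
 * and return without treating the failure as a broken invariant. */
static int rx_mode_run(const char *name, bool fail,
		       struct ratelimit *rl, long now)
{
	unsigned char *snap = NULL;
	int err = snapshot_addrs(6, fail, &snap);

	if (err) {
		if (ratelimit_ok(rl, now))
			fprintf(stderr, "%s: failed to sync uc/mc addresses\n",
				name);
		return err;
	}
	free(snap);
	return 0;
}
```

With burst = 2 and interval = 5, three back-to-back failures log twice
and suppress once; a failure in a later window logs again.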

Re: [PATCH net-next] net: dev_addr_lists: don't WARN on GFP_ATOMIC kmalloc failure in netif_rx_mode_run

Posted by Jakub Sitnicki 5 days, 15 hours ago

On Tue, May 19, 2026 at 02:55 AM -07, Zijing Yin wrote:
> netif_rx_mode_run() fires netdev_WARN() when netif_addr_lists_snapshot()
> returns non-zero. The only path to a non-zero return is -ENOMEM from
> __hw_addr_create() failing its kmalloc(GFP_ATOMIC) -- syzkaller hits it
> via failslab, and it can also be reached under real memory pressure.
>
> This is the only allocator-failure site in this file that WARNs.
> __hw_addr_create() itself, and every other caller of it in this file
> (__hw_addr_add_ex(), dev_uc_add_excl(), dev_uc_add(), dev_mc_add(), ...)
> just propagate the -ENOMEM silently. GFP_ATOMIC is a may-fail allocator;
> callers are required to handle NULL, so a returned -ENOMEM here is an
> expected runtime condition, not an invariant violation.
>
> The miss is self-healing: any subsequent change to dev->uc / dev->mc
> (every dev_{uc,mc}_{add,del}, IFF_PROMISC flip, etc.) calls
> __dev_set_rx_mode() -> netif_rx_mode_queue(), which re-queues the device
> and retries the sync. The only cost of a failed attempt is one stale
> rx-mode window until the next update; nothing in the kernel relies on
> this attempt succeeding.
>
> Demote to net_err_ratelimited() so the condition stays observable in
> dmesg without tripping panic_on_warn.
>
> Reproducer (syzkaller .prog format with setup notes): 
> https://pastebin.com/t7AQKx9v
>
> Fixes: 3554b4345d85 ("net: introduce ndo_set_rx_mode_async and netdev_rx_mode_work")
> Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
> ---
>  net/core/dev_addr_lists.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
> index d73fcb0c6..c6fdcac74 100644
> --- a/net/core/dev_addr_lists.c
> +++ b/net/core/dev_addr_lists.c
> @@ -1275,7 +1275,8 @@ static void netif_rx_mode_run(struct net_device *dev)
>  		err = netif_addr_lists_snapshot(dev, &uc_snap, &mc_snap,
>  						&uc_ref, &mc_ref);
>  		if (err) {
> -			netdev_WARN(dev, "failed to sync uc/mc addresses\n");
> +			net_err_ratelimited("%s: failed to sync uc/mc addresses\n",
> +					    netdev_name(dev));
>  			netif_addr_unlock_bh(dev);
>  			return;
>  		}

1) I'd go with net_warn_ratelimited() instead. The promoted message
level in syslog might cause operational pain, and as you say the
condition is self-healing. Seems unwarranted.

2) Since it has a Fixes tag, it should probably be submitted for the 'net' tree.

3) Was this reported by syzbot? If so, there should also be Reported-by
and Closes tags.

Re: [PATCH net-next] net: dev_addr_lists: don't WARN on GFP_ATOMIC kmalloc failure in netif_rx_mode_run

Posted by Zijing yin 5 days, 14 hours ago

Thank you so much for the feedback! I will address these comments in the 
next version. This was not found by syzbot, by the way.

Hope you have a great day!

On 19.05.2026 13:22, Jakub Sitnicki wrote:
> On Tue, May 19, 2026 at 02:55 AM -07, Zijing Yin wrote:
>> netif_rx_mode_run() fires netdev_WARN() when netif_addr_lists_snapshot()
>> returns non-zero. The only path to a non-zero return is -ENOMEM from
>> __hw_addr_create() failing its kmalloc(GFP_ATOMIC) -- syzkaller hits it
>> via failslab, and it can also be reached under real memory pressure.
>>
>> This is the only allocator-failure site in this file that WARNs.
>> __hw_addr_create() itself, and every other caller of it in this file
>> (__hw_addr_add_ex(), dev_uc_add_excl(), dev_uc_add(), dev_mc_add(), ...)
>> just propagate the -ENOMEM silently. GFP_ATOMIC is a may-fail allocator;
>> callers are required to handle NULL, so a returned -ENOMEM here is an
>> expected runtime condition, not an invariant violation.
>>
>> The miss is self-healing: any subsequent change to dev->uc / dev->mc
>> (every dev_{uc,mc}_{add,del}, IFF_PROMISC flip, etc.) calls
>> __dev_set_rx_mode() -> netif_rx_mode_queue(), which re-queues the device
>> and retries the sync. The only cost of a failed attempt is one stale
>> rx-mode window until the next update; nothing in the kernel relies on
>> this attempt succeeding.
>>
>> Demote to net_err_ratelimited() so the condition stays observable in
>> dmesg without tripping panic_on_warn.
>>
>> Reproducer (syzkaller .prog format with setup notes): 
>> https://pastebin.com/t7AQKx9v
>>
>> Fixes: 3554b4345d85 ("net: introduce ndo_set_rx_mode_async and netdev_rx_mode_work")
>> Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
>> ---
>>  net/core/dev_addr_lists.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
>> index d73fcb0c6..c6fdcac74 100644
>> --- a/net/core/dev_addr_lists.c
>> +++ b/net/core/dev_addr_lists.c
>> @@ -1275,7 +1275,8 @@ static void netif_rx_mode_run(struct net_device *dev)
>>  		err = netif_addr_lists_snapshot(dev, &uc_snap, &mc_snap,
>>  						&uc_ref, &mc_ref);
>>  		if (err) {
>> -			netdev_WARN(dev, "failed to sync uc/mc addresses\n");
>> +			net_err_ratelimited("%s: failed to sync uc/mc addresses\n",
>> +					    netdev_name(dev));
>>  			netif_addr_unlock_bh(dev);
>>  			return;
>>  		}
> 
> 1) I'd go with net_warn_ratelimited() instead. The promoted message
> level in syslog might cause operational pain, and as you say the
> condition is self-healing. Seems unwarranted.
> 
> 2) Since it has a Fixes tag, it should probably be submitted for the 'net' tree.
> 
> 3) Was this reported by syzbot? If so, there should also be Reported-by
> and Closes tags.

Re: [PATCH net-next] net: dev_addr_lists: don't WARN on GFP_ATOMIC kmalloc failure in netif_rx_mode_run

Posted by Jiayuan Chen 5 days, 14 hours ago

On Tue, May 19, 2026 at 02:11:53PM +0800, Zijing yin wrote:
> Thank you so much for the feedback! I will address these comments in the 
> next version. This was not found by syzbot, by the way.

There is a syzbot report now:
https://syzkaller.appspot.com/bug?extid=f2421634072a4b47071e

> 
> Hope you have a great day!
> 
> On 19.05.2026 13:22, Jakub Sitnicki wrote:
> > On Tue, May 19, 2026 at 02:55 AM -07, Zijing Yin wrote:
> >> netif_rx_mode_run() fires netdev_WARN() when netif_addr_lists_snapshot()
> >> returns non-zero. The only path to a non-zero return is -ENOMEM from
> >> __hw_addr_create() failing its kmalloc(GFP_ATOMIC) -- syzkaller hits it
> >> via failslab, and it can also be reached under real memory pressure.
> >>
> >> This is the only allocator-failure site in this file that WARNs.
> >> __hw_addr_create() itself, and every other caller of it in this file
> >> (__hw_addr_add_ex(), dev_uc_add_excl(), dev_uc_add(), dev_mc_add(), ...)
> >> just propagate the -ENOMEM silently. GFP_ATOMIC is a may-fail allocator;
> >> callers are required to handle NULL, so a returned -ENOMEM here is an
> >> expected runtime condition, not an invariant violation.
> >>
> >> The miss is self-healing: any subsequent change to dev->uc / dev->mc
> >> (every dev_{uc,mc}_{add,del}, IFF_PROMISC flip, etc.) calls
> >> __dev_set_rx_mode() -> netif_rx_mode_queue(), which re-queues the device
> >> and retries the sync. The only cost of a failed attempt is one stale
> >> rx-mode window until the next update; nothing in the kernel relies on
> >> this attempt succeeding.
> >>
> >> Demote to net_err_ratelimited() so the condition stays observable in
> >> dmesg without tripping panic_on_warn.
> >>
> >> Reproducer (syzkaller .prog format with setup notes): 
> >> https://pastebin.com/t7AQKx9v
> >>
> >> Fixes: 3554b4345d85 ("net: introduce ndo_set_rx_mode_async and netdev_rx_mode_work")
> >> Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
> >> ---
> >>  net/core/dev_addr_lists.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
> >> index d73fcb0c6..c6fdcac74 100644
> >> --- a/net/core/dev_addr_lists.c
> >> +++ b/net/core/dev_addr_lists.c
> >> @@ -1275,7 +1275,8 @@ static void netif_rx_mode_run(struct net_device *dev)
> >>  		err = netif_addr_lists_snapshot(dev, &uc_snap, &mc_snap,
> >>  						&uc_ref, &mc_ref);
> >>  		if (err) {
> >> -			netdev_WARN(dev, "failed to sync uc/mc addresses\n");
> >> +			net_err_ratelimited("%s: failed to sync uc/mc addresses\n",
> >> +					    netdev_name(dev));
> >>  			netif_addr_unlock_bh(dev);
> >>  			return;
> >>  		}
> > 
> > 1) I'd go with net_warn_ratelimited() instead. The promoted message
> > level in syslog might cause operational pain, and as you say the
> > condition is self-healing. Seems unwarranted.
> > 
> > 2) Since it has a Fixes tag, it should probably be submitted for the 'net' tree.
> > 
> > 3) Was this reported by syzbot? If so, there should also be Reported-by
> > and Closes tags.
>

Re: [PATCH net-next] net: dev_addr_lists: don't WARN on GFP_ATOMIC kmalloc failure in netif_rx_mode_run

Posted by Zijing yin 5 days, 14 hours ago

Thanks! I will attach it accordingly. 

On 19.05.2026 14:22, Jiayuan Chen wrote:
> On Tue, May 19, 2026 at 02:11:53PM +0800, Zijing yin wrote:
>> Thank you so much for the feedback! I will address these comments in the 
>> next version. This was not found by syzbot, by the way.
> 
> There is a syzbot report now:
> https://syzkaller.appspot.com/bug?extid=f2421634072a4b47071e
> 
>>
>> Hope you have a great day!
>>
>> On 19.05.2026 13:22, Jakub Sitnicki wrote:
>>> On Tue, May 19, 2026 at 02:55 AM -07, Zijing Yin wrote:
>>>> netif_rx_mode_run() fires netdev_WARN() when netif_addr_lists_snapshot()
>>>> returns non-zero. The only path to a non-zero return is -ENOMEM from
>>>> __hw_addr_create() failing its kmalloc(GFP_ATOMIC) -- syzkaller hits it
>>>> via failslab, and it can also be reached under real memory pressure.
>>>>
>>>> This is the only allocator-failure site in this file that WARNs.
>>>> __hw_addr_create() itself, and every other caller of it in this file
>>>> (__hw_addr_add_ex(), dev_uc_add_excl(), dev_uc_add(), dev_mc_add(), ...)
>>>> just propagate the -ENOMEM silently. GFP_ATOMIC is a may-fail allocator;
>>>> callers are required to handle NULL, so a returned -ENOMEM here is an
>>>> expected runtime condition, not an invariant violation.
>>>>
>>>> The miss is self-healing: any subsequent change to dev->uc / dev->mc
>>>> (every dev_{uc,mc}_{add,del}, IFF_PROMISC flip, etc.) calls
>>>> __dev_set_rx_mode() -> netif_rx_mode_queue(), which re-queues the device
>>>> and retries the sync. The only cost of a failed attempt is one stale
>>>> rx-mode window until the next update; nothing in the kernel relies on
>>>> this attempt succeeding.
>>>>
>>>> Demote to net_err_ratelimited() so the condition stays observable in
>>>> dmesg without tripping panic_on_warn.
>>>>
>>>> Reproducer (syzkaller .prog format with setup notes): 
>>>> https://pastebin.com/t7AQKx9v
>>>>
>>>> Fixes: 3554b4345d85 ("net: introduce ndo_set_rx_mode_async and netdev_rx_mode_work")
>>>> Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
>>>> ---
>>>>  net/core/dev_addr_lists.c | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
>>>> index d73fcb0c6..c6fdcac74 100644
>>>> --- a/net/core/dev_addr_lists.c
>>>> +++ b/net/core/dev_addr_lists.c
>>>> @@ -1275,7 +1275,8 @@ static void netif_rx_mode_run(struct net_device *dev)
>>>>  		err = netif_addr_lists_snapshot(dev, &uc_snap, &mc_snap,
>>>>  						&uc_ref, &mc_ref);
>>>>  		if (err) {
>>>> -			netdev_WARN(dev, "failed to sync uc/mc addresses\n");
>>>> +			net_err_ratelimited("%s: failed to sync uc/mc addresses\n",
>>>> +					    netdev_name(dev));
>>>>  			netif_addr_unlock_bh(dev);
>>>>  			return;
>>>>  		}
>>>
>>> 1) I'd go with net_warn_ratelimited() instead. The promoted message
>>> level in syslog might cause operational pain, and as you say the
>>> condition is self-healing. Seems unwarranted.
>>>
>>> 2) Since it has a Fixes tag, it should probably be submitted for the 'net' tree.
>>>
>>> 3) Was this reported by syzbot? If so, there should also be Reported-by
>>> and Closes tags.
>>

Re: [PATCH net-next] net: dev_addr_lists: don't WARN on GFP_ATOMIC kmalloc failure in netif_rx_mode_run

Posted by Jakub Kicinski 5 days, 4 hours ago

On Tue, 19 May 2026 14:26:55 +0200 Zijing yin wrote:
> Thanks! I will attach it accordingly. 

Please don't top post. Please wait for Stanislav to chime in.
From a maintainer's perspective it'd really be preferable if the author
of the code had been given a day or two to fix the bug, rather than
(forgive me) random people (forgive me) feeding the syzbot report 
into an LLM.

My recollection is that we added this WARN to convert the whole thing
into pre-allocation / GFP_KERNEL if it actually hits on real systems.

Re: [PATCH net-next] net: dev_addr_lists: don't WARN on GFP_ATOMIC kmalloc failure in netif_rx_mode_run

Posted by Stanislav Fomichev 5 days, 3 hours ago

On 05/19, Jakub Kicinski wrote:
> On Tue, 19 May 2026 14:26:55 +0200 Zijing yin wrote:
> > Thanks! I will attach it accordingly. 
> 
> Please don't top post. Please wait for Stanislav to chime in.
> From a maintainer's perspective it'd really be preferable if the author
> of the code had been given a day or two to fix the bug, rather than
> (forgive me) random people (forgive me) feeding the syzbot report 
> into an LLM.
> 
> My recollection is that we added this WARN to convert the whole thing
> into pre-allocation / GFP_KERNEL if it actually hits on real systems.

I was mainly thinking of the retry mechanism (which also covers bnxt's
BNXT_STATE_L2_FILTER_RETRY), but it looks like pre-allocation is less
controversial :-) Let me sketch something. I think I can keep the same
GFP_ATOMIC in a pre-alloc path and bubble up -ENOMEM to the caller.
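
As a rough illustration of what pre-allocation could look like, here is a
userspace C model. It is one possible reading of the idea, not the patch
being sketched above: struct addr_list, entry_alloc(), addr_list_add(),
addr_list_snapshot() and addr_list_snapshot_put() are all hypothetical
names, and the model assumes at most one snapshot is outstanding at a
time. Each add reserves a spare entry for a later snapshot, so the
snapshot step never allocates and the only place -ENOMEM can surface is
the add call, where the caller can see it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ADDR_LEN 6

struct addr_entry {
	unsigned char addr[ADDR_LEN];
	struct addr_entry *next;
};

/* Hypothetical address list: one spare entry is reserved per live
 * address so a snapshot can be taken without allocating. */
struct addr_list {
	struct addr_entry *head;	/* live addresses */
	struct addr_entry *spare;	/* pre-allocated snapshot entries */
	size_t count;
};

/* Test hook simulating allocator failure. */
static bool inject_failure;

static struct addr_entry *entry_alloc(void)
{
	if (inject_failure)
		return NULL;
	return calloc(1, sizeof(struct addr_entry));
}

/* Add path: allocate the live entry and its spare together and
 * bubble -ENOMEM up to the caller if either allocation fails. */
static int addr_list_add(struct addr_list *l, const unsigned char *addr)
{
	struct addr_entry *live = entry_alloc();
	struct addr_entry *spare = entry_alloc();

	if (!live || !spare) {
		free(live);
		free(spare);
		return -ENOMEM;
	}
	memcpy(live->addr, addr, ADDR_LEN);
	live->next = l->head;
	l->head = live;
	spare->next = l->spare;
	l->spare = spare;
	l->count++;
	return 0;
}

/* Snapshot path: no allocation, so it cannot fail. Consumes one spare
 * per live entry and returns the number of entries copied. */
static size_t addr_list_snapshot(struct addr_list *l, struct addr_entry **snap)
{
	size_t n = 0;

	*snap = NULL;
	for (struct addr_entry *e = l->head; e; e = e->next) {
		struct addr_entry *s = l->spare;

		l->spare = s->next;
		memcpy(s->addr, e->addr, ADDR_LEN);
		s->next = *snap;
		*snap = s;
		n++;
	}
	return n;
}

/* Return snapshot entries to the spare pool for the next snapshot. */
static void addr_list_snapshot_put(struct addr_list *l, struct addr_entry *snap)
{
	while (snap) {
		struct addr_entry *next = snap->next;

		snap->next = l->spare;
		l->spare = snap;
		snap = next;
	}
}
```

In this model a failed add leaves the list unchanged, and a snapshot
taken while allocations are failing still succeeds because it only draws
from the spare pool.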