drivers/net/bonding/bond_main.c | 14 +++++---- drivers/net/net_failover.c | 67 +++++------------------------------------ drivers/net/team/team_core.c | 15 ++++----- include/net/net_failover.h | 7 ----- net/8021q/vlan.c | 2 -- net/bridge/br_device.c | 7 +++++ net/bridge/br_if.c | 6 ---- net/core/dev.c | 8 +++-- net/hsr/hsr_slave.c | 1 - 9 files changed, 37 insertions(+), 90 deletions(-)
Currently, master devices (bonding, bridge, team) manually call
netdev_compute_master_upper_features() scattered throughout their port
add/remove operations. This approach requires each driver to remember
to update features at the right times and leads to code duplication.
The series moves netdev_compute_master_upper_features() to callback
ndo_set_features so that the offload compute could automatically
invoked during feature updates when upper/lower device relationships
change. This centralizes the feature computation flow and removes the
burden from individual drivers.
---
Hangbin Liu (3):
net: use ndo_set_features to set offload features for bonding/bridge/team
failover: use ndo_set_features for failover offload compute
net: no need to disable LRO specifically
drivers/net/bonding/bond_main.c | 14 +++++----
drivers/net/net_failover.c | 67 +++++------------------------------------
drivers/net/team/team_core.c | 15 ++++-----
include/net/net_failover.h | 7 -----
net/8021q/vlan.c | 2 --
net/bridge/br_device.c | 7 +++++
net/bridge/br_if.c | 6 ----
net/core/dev.c | 8 +++--
net/hsr/hsr_slave.c | 1 -
9 files changed, 37 insertions(+), 90 deletions(-)
---
base-commit: 52ede1bce557c66309f41ac29dd190be23ca9129
change-id: 20260310-offload_compute-4c0bafa2e022
Best regards,
--
Hangbin Liu <liuhangbin@gmail.com>
On 3/10/26 8:45 AM, Hangbin Liu wrote: > Currently, master devices (bonding, bridge, team) manually call > netdev_compute_master_upper_features() scattered throughout their port > add/remove operations. This approach requires each driver to remember > to update features at the right times and leads to code duplication. > > The series moves netdev_compute_master_upper_features() to callback > ndo_set_features so that the offload compute could automatically > invoked during feature updates when upper/lower device relationships > change. This centralizes the feature computation flow and removes the > burden from individual drivers. > > --- > Hangbin Liu (3): > net: use ndo_set_features to set offload features for bonding/bridge/team > failover: use ndo_set_features for failover offload compute > net: no need to disable LRO specifically > > drivers/net/bonding/bond_main.c | 14 +++++---- > drivers/net/net_failover.c | 67 +++++------------------------------------ > drivers/net/team/team_core.c | 15 ++++----- > include/net/net_failover.h | 7 ----- > net/8021q/vlan.c | 2 -- > net/bridge/br_device.c | 7 +++++ > net/bridge/br_if.c | 6 ---- > net/core/dev.c | 8 +++-- > net/hsr/hsr_slave.c | 1 - > 9 files changed, 37 insertions(+), 90 deletions(-) I'm dropping this series from PW due to the self-tests failures. I guess it could be restored later if it turns out the real problem is elsewhere. /P
syzbot ci has tested the following series [v1] net: move netdev_compute_master_upper_features to ndo_set_features https://lore.kernel.org/all/20260310-offload_compute-v1-0-3df79c09ea65@gmail.com * [PATCH net-next 1/3] net: use ndo_set_features to set offload features for bonding/bridge/team * [PATCH net-next 2/3] failover: use ndo_set_features for failover offload compute * [PATCH net-next 3/3] net: no need to disable LRO specifically and found the following issue: WARNING in rtmsg_ifinfo_build_skb Full report is available here: https://ci.syzbot.org/series/d5001c4a-a51e-49c1-9106-624836e43ec2 *** WARNING in rtmsg_ifinfo_build_skb tree: net-next URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git base: 52ede1bce557c66309f41ac29dd190be23ca9129 arch: amd64 compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8 config: https://ci.syzbot.org/builds/281e7e02-f6af-4b4a-845e-d1d5842a9301/config batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active hsr_slave_0: entered promiscuous mode hsr_slave_1: entered promiscuous mode ------------[ cut here ]------------ err == -EMSGSIZE WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496 Modules linked in: CPU: 0 UID: 0 PID: 6496 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 RIP: 0010:rtmsg_ifinfo_build_skb+0x218/0x260 Code: f6 ba 01 00 00 00 89 e9 e8 25 15 3a 00 4c 89 f0 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d e9 7f 3a 2e 02 cc e8 49 3b 42 f8 90 <0f> 0b 90 eb 90 89 d9 80 e1 07 fe c1 38 c1 0f 8c 95 fe ff ff 48 89 RSP: 0018:ffffc9000637e9a0 EFLAGS: 00010293 RAX: ffffffff89835e27 RBX: 0000000000000000 RCX: ffff8881b80a57c0 RDX: 0000000000000000 RSI: 00000000ffffffa6 RDI: 00000000ffffffa6 RBP: 00000000ffffffa6 R08: 0000000000000004 R09: 0000000000000004 R10: fffff52000c6fcdc R11: 0000000000000000 R12: 1ffff110235ddc21 R13: 0000000000000000 R14: ffff8881133dc780 R15: ffff88811aeee000 FS: 0000555557c4a500(0000) GS:ffff88818de65000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055555e1838c8 CR3: 0000000168b80000 CR4: 00000000000006f0 Call Trace: <TASK> rtnetlink_event+0x1b7/0x270 notifier_call_chain+0x1be/0x400 netdev_change_features+0x95/0xe0 __netdev_upper_dev_link+0xb20/0xc80 netdev_upper_dev_link+0xb0/0x100 macsec_newlink+0xb11/0x1200 rtnl_newlink_create+0x329/0xb70 rtnl_newlink+0x1666/0x1be0 rtnetlink_rcv_msg+0x7d5/0xbe0 netlink_rcv_skb+0x232/0x4b0 netlink_unicast+0x80f/0x9b0 netlink_sendmsg+0x813/0xb40 __sys_sendto+0x672/0x710 __x64_sys_sendto+0xde/0x100 do_syscall_64+0x14d/0xf80 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ffadf757917 Code: 48 89 fa 4c 89 df e8 a8 56 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff RSP: 002b:00007ffd0eb79f80 EFLAGS: 00000202 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000555557c4a500 RCX: 00007ffadf757917 RDX: 0000000000000054 RSI: 00007ffae0544670 RDI: 0000000000000003 RBP: 0000000000000001 R08: 00007ffd0eb79fe4 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003 R13: 0000000000000000 R14: 00007ffae0544670 R15: 0000000000000000 </TASK> *** If these findings have caused you to resend the series or submit a separate fix, please add the following tag to your commit message: Tested-by: syzbot@syzkaller.appspotmail.com --- This report is generated by a bot. It may contain errors. syzbot ci engineers can be reached at syzkaller@googlegroups.com.
2026-03-10, 10:02:09 -0700, syzbot ci wrote:
> batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
> hsr_slave_0: entered promiscuous mode
> hsr_slave_1: entered promiscuous mode
> ------------[ cut here ]------------
> err == -EMSGSIZE
> WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496
I'm not sure this one is caused by this series, but either way,
reviewing if_nlmsg_size/rtnl_fill_ifinfo for mismatches is really
unpleasant :/
Things I see in rtnl_fill_ifinfo but don't find in if_nlmsg_size:
- IFLA_PARENT_DEV_NAME
- IFLA_PARENT_DEV_BUS_NAME
(both from 00e77ed8e64d ("rtnetlink: add
IFLA_PARENT_[DEV|DEV_BUS]_NAME"), which doesn't include a change to
if_nlmsg_size)
- rtnl_link_slave_info_fill also outputs IFLA_INFO_SLAVE_KIND + the
IFLA_INFO_SLAVE_DATA nest, but rtnl_link_get_slave_info_data_size
only counts the nest, and its caller (rtnl_link_get_size) doesn't
have anything more about the slave info. This may be what syzbot is
tripping on here.
But there's a
+ nla_total_size(4) /* IFLA_WEIGHT */
that doesn't get filled anywhere.
> Modules linked in:
> CPU: 0 UID: 0 PID: 6496 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> RIP: 0010:rtmsg_ifinfo_build_skb+0x218/0x260
> Code: f6 ba 01 00 00 00 89 e9 e8 25 15 3a 00 4c 89 f0 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d e9 7f 3a 2e 02 cc e8 49 3b 42 f8 90 <0f> 0b 90 eb 90 89 d9 80 e1 07 fe c1 38 c1 0f 8c 95 fe ff ff 48 89
> RSP: 0018:ffffc9000637e9a0 EFLAGS: 00010293
> RAX: ffffffff89835e27 RBX: 0000000000000000 RCX: ffff8881b80a57c0
> RDX: 0000000000000000 RSI: 00000000ffffffa6 RDI: 00000000ffffffa6
> RBP: 00000000ffffffa6 R08: 0000000000000004 R09: 0000000000000004
> R10: fffff52000c6fcdc R11: 0000000000000000 R12: 1ffff110235ddc21
> R13: 0000000000000000 R14: ffff8881133dc780 R15: ffff88811aeee000
> FS: 0000555557c4a500(0000) GS:ffff88818de65000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055555e1838c8 CR3: 0000000168b80000 CR4: 00000000000006f0
> Call Trace:
> <TASK>
> rtnetlink_event+0x1b7/0x270
> notifier_call_chain+0x1be/0x400
> netdev_change_features+0x95/0xe0
> __netdev_upper_dev_link+0xb20/0xc80
> netdev_upper_dev_link+0xb0/0x100
> macsec_newlink+0xb11/0x1200
> rtnl_newlink_create+0x329/0xb70
> rtnl_newlink+0x1666/0x1be0
--
Sabrina
On Tue, Mar 10, 2026 at 08:17:01PM +0100, Sabrina Dubroca wrote:
> 2026-03-10, 10:02:09 -0700, syzbot ci wrote:
> > batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
> > hsr_slave_0: entered promiscuous mode
> > hsr_slave_1: entered promiscuous mode
> > ------------[ cut here ]------------
> > err == -EMSGSIZE
> > WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496
>
> I'm not sure this one is caused by this series, but either way,
rtnetlink_event+0x1b7/0x270
notifier_call_chain+0x1be/0x400
netdev_change_features+0x95/0xe0
__netdev_upper_dev_link+0xb20/0xc80
netdev_upper_dev_link+0xb0/0x100
This patch calls netdev_change_features() after __netdev_upper_dev_link(),
Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
to fill the new link info. Maybe the event is a bit early and macsec has
data not ready?
Thanks
Hangbin
> reviewing if_nlmsg_size/rtnl_fill_ifinfo for mismatches is really
> unpleasant :/
>
> Things I see in rtnl_fill_ifinfo but don't find in if_nlmsg_size:
> - IFLA_PARENT_DEV_NAME
> - IFLA_PARENT_DEV_BUS_NAME
> (both from 00e77ed8e64d ("rtnetlink: add
> IFLA_PARENT_[DEV|DEV_BUS]_NAME"), which doesn't include a change to
> if_nlmsg_size)
> - rtnl_link_slave_info_fill also outputs IFLA_INFO_SLAVE_KIND + the
> IFLA_INFO_SLAVE_DATA nest, but rtnl_link_get_slave_info_data_size
> only counts the nest, and its caller (rtnl_link_get_size) doesn't
> have anything more about the slave info. This may be what syzbot is
> tripping on here.
>
>
> But there's a
>
> + nla_total_size(4) /* IFLA_WEIGHT */
>
> that doesn't get filled anywhere.
>
>
> > Modules linked in:
> > CPU: 0 UID: 0 PID: 6496 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > RIP: 0010:rtmsg_ifinfo_build_skb+0x218/0x260
> > Code: f6 ba 01 00 00 00 89 e9 e8 25 15 3a 00 4c 89 f0 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d e9 7f 3a 2e 02 cc e8 49 3b 42 f8 90 <0f> 0b 90 eb 90 89 d9 80 e1 07 fe c1 38 c1 0f 8c 95 fe ff ff 48 89
> > RSP: 0018:ffffc9000637e9a0 EFLAGS: 00010293
> > RAX: ffffffff89835e27 RBX: 0000000000000000 RCX: ffff8881b80a57c0
> > RDX: 0000000000000000 RSI: 00000000ffffffa6 RDI: 00000000ffffffa6
> > RBP: 00000000ffffffa6 R08: 0000000000000004 R09: 0000000000000004
> > R10: fffff52000c6fcdc R11: 0000000000000000 R12: 1ffff110235ddc21
> > R13: 0000000000000000 R14: ffff8881133dc780 R15: ffff88811aeee000
> > FS: 0000555557c4a500(0000) GS:ffff88818de65000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000055555e1838c8 CR3: 0000000168b80000 CR4: 00000000000006f0
> > Call Trace:
> > <TASK>
> > rtnetlink_event+0x1b7/0x270
> > notifier_call_chain+0x1be/0x400
> > netdev_change_features+0x95/0xe0
> > __netdev_upper_dev_link+0xb20/0xc80
> > netdev_upper_dev_link+0xb0/0x100
> > macsec_newlink+0xb11/0x1200
> > rtnl_newlink_create+0x329/0xb70
> > rtnl_newlink+0x1666/0x1be0
>
> --
> Sabrina
2026-03-11, 00:47:41 +0000, Hangbin Liu wrote: > On Tue, Mar 10, 2026 at 08:17:01PM +0100, Sabrina Dubroca wrote: > > 2026-03-10, 10:02:09 -0700, syzbot ci wrote: > > > batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active > > > hsr_slave_0: entered promiscuous mode > > > hsr_slave_1: entered promiscuous mode > > > ------------[ cut here ]------------ > > > err == -EMSGSIZE > > > WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496 > > > > I'm not sure this one is caused by this series, but either way, > > > rtnetlink_event+0x1b7/0x270 > notifier_call_chain+0x1be/0x400 > netdev_change_features+0x95/0xe0 > __netdev_upper_dev_link+0xb20/0xc80 > netdev_upper_dev_link+0xb0/0x100 > > > This patch calls netdev_change_features() after __netdev_upper_dev_link(), > Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event() > to fill the new link info. Maybe the event is a bit early and macsec has > data not ready? But this would still mean that there's a mismatch between if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only revealing it. I'll send fixes for the stuff I mentioned, no idea if that's what syzbot saw since we don't have a repro. -- Sabrina
On 3/11/26 10:18 PM, Sabrina Dubroca wrote: > 2026-03-11, 00:47:41 +0000, Hangbin Liu wrote: >> On Tue, Mar 10, 2026 at 08:17:01PM +0100, Sabrina Dubroca wrote: >>> 2026-03-10, 10:02:09 -0700, syzbot ci wrote: >>>> batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active >>>> hsr_slave_0: entered promiscuous mode >>>> hsr_slave_1: entered promiscuous mode >>>> ------------[ cut here ]------------ >>>> err == -EMSGSIZE >>>> WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496 >>> >>> I'm not sure this one is caused by this series, but either way, >> >> >> rtnetlink_event+0x1b7/0x270 >> notifier_call_chain+0x1be/0x400 >> netdev_change_features+0x95/0xe0 >> __netdev_upper_dev_link+0xb20/0xc80 >> netdev_upper_dev_link+0xb0/0x100 >> >> >> This patch calls netdev_change_features() after __netdev_upper_dev_link(), >> Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event() >> to fill the new link info. Maybe the event is a bit early and macsec has >> data not ready? > > But this would still mean that there's a mismatch between > if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only > revealing it. > > I'll send fixes for the stuff I mentioned, no idea if that's what > syzbot saw since we don't have a repro. It looks like even the nipa CI is reproducing the issue, i.e.: https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/554921/17-rtnetlink-sh/ more failures here: https://netdev.bots.linux.dev/contest.html?pw-n=0&branch=net-next-2026-03-12--06-00&pw-n=0&pass=0 the fail in mascsec-offload looks quite deterministic, could you please have a look? /P
2026-03-12, 10:40:43 +0100, Paolo Abeni wrote:
> On 3/11/26 10:18 PM, Sabrina Dubroca wrote:
> > 2026-03-11, 00:47:41 +0000, Hangbin Liu wrote:
> >> On Tue, Mar 10, 2026 at 08:17:01PM +0100, Sabrina Dubroca wrote:
> >>> 2026-03-10, 10:02:09 -0700, syzbot ci wrote:
> >>>> batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
> >>>> hsr_slave_0: entered promiscuous mode
> >>>> hsr_slave_1: entered promiscuous mode
> >>>> ------------[ cut here ]------------
> >>>> err == -EMSGSIZE
> >>>> WARNING: net/core/rtnetlink.c:4421 at rtmsg_ifinfo_build_skb+0x218/0x260, CPU#0: syz-executor/6496
> >>>
> >>> I'm not sure this one is caused by this series, but either way,
> >>
> >>
> >> rtnetlink_event+0x1b7/0x270
> >> notifier_call_chain+0x1be/0x400
> >> netdev_change_features+0x95/0xe0
> >> __netdev_upper_dev_link+0xb20/0xc80
> >> netdev_upper_dev_link+0xb0/0x100
> >>
> >>
> >> This patch calls netdev_change_features() after __netdev_upper_dev_link(),
> >> Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
> >> to fill the new link info. Maybe the event is a bit early and macsec has
> >> data not ready?
> >
> > But this would still mean that there's a mismatch between
> > if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only
> > revealing it.
> >
> > I'll send fixes for the stuff I mentioned, no idea if that's what
> > syzbot saw since we don't have a repro.
>
> It looks like even the nipa CI is reproducing the issue, i.e.:
>
> https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/554921/17-rtnetlink-sh/
>
> more failures here:
>
> https://netdev.bots.linux.dev/contest.html?pw-n=0&branch=net-next-2026-03-12--06-00&pw-n=0&pass=0
>
> the fail in mascsec-offload looks quite deterministic, could you please
> have a look?
Ah crap, sorry Hangbin, you were right. macsec_fill_info() returns
-EMSGSIZE when the key length is unexpected, and at this point we
haven't set it to its proper value yet.
Bandaid solution would be:
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index f6cad0746a02..0f7ef7bfbdde 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -4337,7 +4337,7 @@ static int macsec_fill_info(struct sk_buff *skb,
csid = secy->xpn ? MACSEC_CIPHER_ID_GCM_AES_XPN_256 : MACSEC_CIPHER_ID_GCM_AES_256;
break;
default:
- goto nla_put_failure;
+ return 0;
}
if (nla_put_sci(skb, IFLA_MACSEC_SCI, secy->sci,
Proper fix (so that the notification we're sending during
upper_dev_link has full linkinfo) would be to move
netdev_upper_dev_link() to after macsec_changelink_common() and fix up
the error handling. I don't see anything in macsec_add_dev or
macsec_changelink_common that needs the device to be linked. But
anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
on invalid data, so the "bandaid" should be included as well.
Should this be part of this series (either just the "bandaid" or the
"proper fix"+bandaid), since we never saw a problem before?
--
Sabrina
On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote:
> > >> This patch calls netdev_change_features() after __netdev_upper_dev_link(),
> > >> Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
> > >> to fill the new link info. Maybe the event is a bit early and macsec has
> > >> data not ready?
> > >
> > > But this would still mean that there's a mismatch between
> > > if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only
> > > revealing it.
> > >
> > > I'll send fixes for the stuff I mentioned, no idea if that's what
> > > syzbot saw since we don't have a repro.
> >
> > It looks like even the nipa CI is reproducing the issue, i.e.:
> >
> > https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/554921/17-rtnetlink-sh/
> >
> > more failures here:
> >
> > https://netdev.bots.linux.dev/contest.html?pw-n=0&branch=net-next-2026-03-12--06-00&pw-n=0&pass=0
> >
> > the fail in mascsec-offload looks quite deterministic, could you please
> > have a look?
>
> Ah crap, sorry Hangbin, you were right. macsec_fill_info() returns
> -EMSGSIZE when the key length is unexpected, and at this point we
> haven't set it to its proper value yet.
>
> Bandaid solution would be:
>
> diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
> index f6cad0746a02..0f7ef7bfbdde 100644
> --- a/drivers/net/macsec.c
> +++ b/drivers/net/macsec.c
> @@ -4337,7 +4337,7 @@ static int macsec_fill_info(struct sk_buff *skb,
> csid = secy->xpn ? MACSEC_CIPHER_ID_GCM_AES_XPN_256 : MACSEC_CIPHER_ID_GCM_AES_256;
> break;
> default:
> - goto nla_put_failure;
> + return 0;
> }
>
> if (nla_put_sci(skb, IFLA_MACSEC_SCI, secy->sci,
>
>
> Proper fix (so that the notification we're sending during
> upper_dev_link has full linkinfo) would be to move
> netdev_upper_dev_link() to after macsec_changelink_common() and fix up
> the error handling. I don't see anything in macsec_add_dev or
> macsec_changelink_common that needs the device to be linked. But
If we move the netdev_upper_dev_link() after macsec_changelink_common(),
we will not goto nla_put_failure via default, right?
> anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
> on invalid data, so the "bandaid" should be included as well.
>
> Should this be part of this series (either just the "bandaid" or the
> "proper fix"+bandaid), since we never saw a problem before?
Since macsec need the "bandaid" fix either way. How about you post the
"bandaid" fix to net. And I include the "proper fix" in this series for
net-next?
BTW, here is my draft patch, would you like to review it first?
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index f6cad0746a02..1f8da4c291c6 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -4161,10 +4161,6 @@ static int macsec_newlink(struct net_device *dev,
lockdep_set_class(&dev->addr_list_lock,
&macsec_netdev_addr_lock_key);
- err = netdev_upper_dev_link(real_dev, dev, extack);
- if (err < 0)
- goto unregister;
-
/* need to be already registered so that ->init has run and
* the MAC addr is set
*/
@@ -4177,12 +4173,12 @@ static int macsec_newlink(struct net_device *dev,
if (rx_handler && sci_exists(real_dev, sci)) {
err = -EBUSY;
- goto unlink;
+ goto unregister;
}
err = macsec_add_dev(dev, sci, icv_len);
if (err)
- goto unlink;
+ goto unregister;
if (data) {
err = macsec_changelink_common(dev, data);
@@ -4190,6 +4186,10 @@ static int macsec_newlink(struct net_device *dev,
goto del_dev;
}
+ err = netdev_upper_dev_link(real_dev, dev, extack);
+ if (err < 0)
+ goto unlink;
+
/* If h/w offloading is available, propagate to the device */
if (macsec_is_offloaded(macsec)) {
const struct macsec_ops *ops;
@@ -4200,7 +4200,7 @@ static int macsec_newlink(struct net_device *dev,
ctx.secy = &macsec->secy;
err = macsec_offload(ops->mdo_add_secy, &ctx);
if (err)
- goto del_dev;
+ goto unlink;
macsec->insert_tx_tag =
macsec_needs_tx_tag(macsec, ops);
@@ -4209,7 +4209,7 @@ static int macsec_newlink(struct net_device *dev,
err = register_macsec_dev(real_dev, dev);
if (err < 0)
- goto del_dev;
+ goto unlink;
netdev_update_features(dev);
netif_stacked_transfer_operstate(real_dev, dev);
@@ -4219,10 +4219,10 @@ static int macsec_newlink(struct net_device *dev,
return 0;
-del_dev:
- macsec_del_dev(macsec);
unlink:
netdev_upper_dev_unlink(real_dev, dev);
+del_dev:
+ macsec_del_dev(macsec);
unregister:
unregister_netdevice(dev);
return err;
Thanks
Hangbin
2026-03-12, 14:34:44 +0000, Hangbin Liu wrote:
> On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote:
> > Proper fix (so that the notification we're sending during
> > upper_dev_link has full linkinfo) would be to move
> > netdev_upper_dev_link() to after macsec_changelink_common() and fix up
> > the error handling. I don't see anything in macsec_add_dev or
> > macsec_changelink_common that needs the device to be linked. But
>
> If we move the netdev_upper_dev_link() after macsec_changelink_common(),
> we will not goto nla_put_failure via default, right?
Yes.
> > anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
> > on invalid data, so the "bandaid" should be included as well.
> >
> > Should this be part of this series (either just the "bandaid" or the
> > "proper fix"+bandaid), since we never saw a problem before?
>
> Since macsec need the "bandaid" fix either way. How about you post the
> "bandaid" fix to net. And I include the "proper fix" in this series for
> net-next?
But I don't think it's needed in net. Am I missing a codepath (before
your series) where macsec_fill_info could be called for the new device
before macsec_newlink returns? If not, it doesn't really qualify as a
fix, that's why I was asking Paolo.
> BTW, here is my draft patch, would you like to review it first?
>
> diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
> index f6cad0746a02..1f8da4c291c6 100644
> --- a/drivers/net/macsec.c
> +++ b/drivers/net/macsec.c
> @@ -4161,10 +4161,6 @@ static int macsec_newlink(struct net_device *dev,
> lockdep_set_class(&dev->addr_list_lock,
> &macsec_netdev_addr_lock_key);
>
> - err = netdev_upper_dev_link(real_dev, dev, extack);
> - if (err < 0)
> - goto unregister;
> -
> /* need to be already registered so that ->init has run and
> * the MAC addr is set
> */
> @@ -4177,12 +4173,12 @@ static int macsec_newlink(struct net_device *dev,
>
> if (rx_handler && sci_exists(real_dev, sci)) {
> err = -EBUSY;
> - goto unlink;
> + goto unregister;
> }
>
> err = macsec_add_dev(dev, sci, icv_len);
> if (err)
> - goto unlink;
> + goto unregister;
>
> if (data) {
> err = macsec_changelink_common(dev, data);
> @@ -4190,6 +4186,10 @@ static int macsec_newlink(struct net_device *dev,
> goto del_dev;
> }
>
> + err = netdev_upper_dev_link(real_dev, dev, extack);
> + if (err < 0)
> + goto unlink;
This one shouldn't be "goto unlink" since we haven't linked yet. I'm
not sure netdev_upper_dev_unlink can handle "not linked yet" devices.
Otherwise this looks ok.
--
Sabrina
On 3/12/26 4:58 PM, Sabrina Dubroca wrote: > 2026-03-12, 14:34:44 +0000, Hangbin Liu wrote: >> On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote: >>> Proper fix (so that the notification we're sending during >>> upper_dev_link has full linkinfo) would be to move >>> netdev_upper_dev_link() to after macsec_changelink_common() and fix up >>> the error handling. I don't see anything in macsec_add_dev or >>> macsec_changelink_common that needs the device to be linked. But >> >> If we move the netdev_upper_dev_link() after macsec_changelink_common(), >> we will not goto nla_put_failure via default, right? > > Yes. > >>> anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE >>> on invalid data, so the "bandaid" should be included as well. >>> >>> Should this be part of this series (either just the "bandaid" or the >>> "proper fix"+bandaid), since we never saw a problem before? >> >> Since macsec need the "bandaid" fix either way. How about you post the >> "bandaid" fix to net. And I include the "proper fix" in this series for >> net-next? > > But I don't think it's needed in net. Am I missing a codepath (before > your series) where macsec_fill_info could be called for the new device > before macsec_newlink returns? If not, it doesn't really qualify as a > fix, that's why I was asking Paolo. FWIW, I don't see a codepath calling into rtmsg_ifinfo_build_skb() before device initialization, so I would be fine targeting net-next with the EMSGSIZE-related change. Side note, it looks like that the WARN() in the rnnetlink code here helped identifying a real problem and correctly returning 0 when the key_len is not yet initialized will silence it forever, what about preserving a warning for this kind of race? something alike: --- diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index f6cad0746a02..82974d4fa3f6 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -4337,7 +4337,8 @@ static int macsec_fill_info(struct sk_buff *skb, csid = secy->xpn ? MACSEC_CIPHER_ID_GCM_AES_XPN_256 : MACSEC_CIPHER_ID_GCM_AES_256; break; default: - goto nla_put_failure; + WARN_ON_ONCE(1); + return 0; }
2026-03-12, 17:47:45 +0100, Paolo Abeni wrote: > > > On 3/12/26 4:58 PM, Sabrina Dubroca wrote: > > 2026-03-12, 14:34:44 +0000, Hangbin Liu wrote: > >> On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote: > >>> Proper fix (so that the notification we're sending during > >>> upper_dev_link has full linkinfo) would be to move > >>> netdev_upper_dev_link() to after macsec_changelink_common() and fix up > >>> the error handling. I don't see anything in macsec_add_dev or > >>> macsec_changelink_common that needs the device to be linked. But > >> > >> If we move the netdev_upper_dev_link() after macsec_changelink_common(), > >> we will not goto nla_put_failure via default, right? > > > > Yes. > > > >>> anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE > >>> on invalid data, so the "bandaid" should be included as well. > >>> > >>> Should this be part of this series (either just the "bandaid" or the > >>> "proper fix"+bandaid), since we never saw a problem before? > >> > >> Since macsec need the "bandaid" fix either way. How about you post the > >> "bandaid" fix to net. And I include the "proper fix" in this series for > >> net-next? > > > > But I don't think it's needed in net. Am I missing a codepath (before > > your series) where macsec_fill_info could be called for the new device > > before macsec_newlink returns? If not, it doesn't really qualify as a > > fix, that's why I was asking Paolo. > > FWIW, I don't see a codepath calling into rtmsg_ifinfo_build_skb() > before device initialization, so I would be fine targeting net-next with > the EMSGSIZE-related change. > > Side note, it looks like that the WARN() in the rnnetlink code here > helped identifying a real problem and correctly returning 0 when the > key_len is not yet initialized will silence it forever, what about > preserving a warning for this kind of race? something alike: Right, I was thinking about putting a DEBUG_NET_WARN_ON_ONCE there. -- Sabrina
© 2016 - 2026 Red Hat, Inc.