From: Jiayuan Chen <jiayuan.chen@shopee.com>
bond_rr_gen_slave_id() dereferences bond->rr_tx_counter without a NULL
check. rr_tx_counter is a per-CPU counter only allocated in bond_open()
when the bond mode is round-robin. If the bond device was never brought
up, rr_tx_counter remains NULL, causing a null-ptr-deref.
The XDP redirect path can reach this code even when the bond is not up:
bpf_master_redirect_enabled_key is a global static key, so when any bond
device has native XDP attached, the XDP_TX -> xdp_master_redirect()
interception is enabled for all bond slaves system-wide. This allows the
path xdp_master_redirect() -> bond_xdp_get_xmit_slave() ->
bond_xdp_xmit_roundrobin_slave_get() -> bond_rr_gen_slave_id() to be
reached on a bond that was never opened.
The normal TX path (bond_xmit_roundrobin) is not affected because TX
requires the bond to be UP, which guarantees rr_tx_counter is allocated.
However, bond_xmit_get_slave() (ndo_get_xmit_slave) has the same code
pattern via bond_xmit_roundrobin_slave_get() and could theoretically
hit the same issue.
Fix this by introducing bond_create_init() to allocate rr_tx_counter
unconditionally at device creation time. It is called from both
bond_create() and bond_newlink() before register_netdevice(), and
returns -ENOMEM on failure so callers can propagate the error cleanly.
bond_setup() is not suitable for this allocation as it is a void
callback with no error return path. The conditional allocation in
bond_open() is removed. Since bond_destructor() already unconditionally
calls free_percpu(bond->rr_tx_counter), the lifecycle is clean:
allocate at creation, free at destruction.
Fixes: 879af96ffd72 ("net, core: Add support for XDP redirection to slave device")
Reported-by: syzbot+80e046b8da2820b6ba73@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/698f84c6.a70a0220.2c38d7.00cc.GAE@google.com/T/
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
---
drivers/net/bonding/bond_main.c | 18 ++++++++++++------
drivers/net/bonding/bond_netlink.c | 4 ++++
include/net/bonding.h | 1 +
3 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 78cff904cdc3..806034dc301f 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4273,18 +4273,18 @@ void bond_work_cancel_all(struct bonding *bond)
cancel_delayed_work_sync(&bond->peer_notify_work);
}
+int bond_create_init(struct bonding *bond)
+{
+ bond->rr_tx_counter = alloc_percpu(u32);
+ return bond->rr_tx_counter ? 0 : -ENOMEM;
+}
+
static int bond_open(struct net_device *bond_dev)
{
struct bonding *bond = netdev_priv(bond_dev);
struct list_head *iter;
struct slave *slave;
- if (BOND_MODE(bond) == BOND_MODE_ROUNDROBIN && !bond->rr_tx_counter) {
- bond->rr_tx_counter = alloc_percpu(u32);
- if (!bond->rr_tx_counter)
- return -ENOMEM;
- }
-
/* reset slave->backup and slave->inactive */
if (bond_has_slaves(bond)) {
bond_for_each_slave(bond, slave, iter) {
@@ -6458,6 +6458,12 @@ int bond_create(struct net *net, const char *name)
dev_net_set(bond_dev, net);
bond_dev->rtnl_link_ops = &bond_link_ops;
+ res = bond_create_init(bond);
+ if (res) {
+ free_netdev(bond_dev);
+ goto out;
+ }
+
res = register_netdevice(bond_dev);
if (res < 0) {
free_netdev(bond_dev);
diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index 286f11c517f7..91595df85f06 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -598,6 +598,10 @@ static int bond_newlink(struct net_device *bond_dev,
struct nlattr **tb = params->tb;
int err;
+ err = bond_create_init(bond);
+ if (err)
+ return err;
+
err = register_netdevice(bond_dev);
if (err)
return err;
diff --git a/include/net/bonding.h b/include/net/bonding.h
index 4ad5521e7731..dac4725f3ac0 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -714,6 +714,7 @@ void bond_slave_arr_work_rearm(struct bonding *bond, unsigned long delay);
void bond_peer_notify_work_rearm(struct bonding *bond, unsigned long delay);
void bond_work_init_all(struct bonding *bond);
void bond_work_cancel_all(struct bonding *bond);
+int bond_create_init(struct bonding *bond);
#ifdef CONFIG_PROC_FS
void bond_create_proc_entry(struct bonding *bond);
--
2.43.0
On 2026-02-27 17:22:49 [+0800], Jiayuan Chen wrote: > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -6458,6 +6458,12 @@ int bond_create(struct net *net, const char *name) > dev_net_set(bond_dev, net); > bond_dev->rtnl_link_ops = &bond_link_ops; > > + res = bond_create_init(bond); Wouldn't it be better to put into bond_init()? I haven't look into it but when can the bond_mode be changed? Sebastian
2026/2/27 17:45, "Sebastian Andrzej Siewior" <bigeasy@linutronix.de mailto:bigeasy@linutronix.de?to=%22Sebastian%20Andrzej%20Siewior%22%20%3Cbigeasy%40linutronix.de%3E > wrote: > > On 2026-02-27 17:22:49 [+0800], Jiayuan Chen wrote: > > > > > --- a/drivers/net/bonding/bond_main.c > > +++ b/drivers/net/bonding/bond_main.c > > @@ -6458,6 +6458,12 @@ int bond_create(struct net *net, const char *name) > > dev_net_set(bond_dev, net); > > bond_dev->rtnl_link_ops = &bond_link_ops; > > > > + res = bond_create_init(bond); > > > Wouldn't it be better to put into bond_init()? > > I haven't look into it but when can the bond_mode be changed? > > Sebastian > Thanks! bond_init() (ndo_init) is indeed a better fit, it is called by register_netdevice() and naturally covers both bond_create() and bond_newlink() without a separate helper. bond_mode can be changed after device creation via sysfs or netlink, a bond created in active-backup mode can later be switched to round-robin, which means the allocation must not be conditional on the mode at creation time.
On 2026-02-27 10:17:29 [+0000], Jiayuan Chen wrote: > bond_mode can be changed after device creation via sysfs or netlink, a bond created > in active-backup mode can later be switched to round-robin, which means the allocation > must not be conditional on the mode at creation time. Must the device be in down state or can this be also changed while the device is up? Sebastian
February 27, 2026 at 18:21, "Sebastian Andrzej Siewior" <bigeasy@linutronix.de mailto:bigeasy@linutronix.de?to=%22Sebastian%20Andrzej%20Siewior%22%20%3Cbigeasy%40linutronix.de%3E > wrote: > > On 2026-02-27 10:17:29 [+0000], Jiayuan Chen wrote: > > > > > bond_mode can be changed after device creation via sysfs or netlink, a bond created > > in active-backup mode can later be switched to round-robin, which means the allocation > > must not be conditional on the mode at creation time. > > > Must the device be in down state or can this be also changed while the > device is up? > > Sebastian > The mode change requires the device to be DOWN. BOND_OPT_MODE is defined with BOND_OPTFLAG_IFDOWN, and bond_opt_check_flags() enforces this: if ((opt->flags & BOND_OPTFLAG_IFDOWN) && (bond->dev->flags & IFF_UP)) return -EBUSY; The same restriction applies to the netlink path as well. Both sysfs and netlink go through __bond_opt_set() → bond_opt_check_deps(), which enforces BOND_OPTFLAG_IFDOWN for mode change. Attempting to change the mode while the device is UP returns -EBUSY regardless of how the change is requested. So unconditional allocation in bond_init() covers all cases: whether the device is created in round-robin mode, or switched to round-robin later (which requires being DOWN, meaning bond_open() hasn't been called with the new mode yet). Thanks,
© 2016 - 2026 Red Hat, Inc.