From nobody Sun Feb 8 10:44:13 2026
From: I Viswanath
To: edumazet@google.com, andrew+netdev@lunn.ch, horms@kernel.org, kuba@kernel.org, pabeni@redhat.com, mst@redhat.com, eperezma@redhat.com, jasowang@redhat.com, xuanzhuo@linux.alibaba.com
Cc: netdev@vger.kernel.org, virtualization@lists.linux.dev, linux-kernel@vger.kernel.org, I Viswanath
Subject: [PATCH net-next v7 1/2] net: refactor set_rx_mode into snapshot and deferred I/O
Date: Fri, 2 Jan 2026 23:35:29 +0530
Message-ID: <20260102180530.1559514-2-viswanathiyyappan@gmail.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260102180530.1559514-1-viswanathiyyappan@gmail.com>
References: <20260102180530.1559514-1-viswanathiyyappan@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

ndo_set_rx_mode is problematic because it cannot sleep. Several drivers work around this by doing the rx_mode work in a work item, which is boilerplate that could be avoided if the core provided such a mechanism itself. This patch proposes one: refactor set_rx_mode into two stages, a snapshot stage and the actual I/O.
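For illustration only (this sketch is not part of the patch), a driver written against the new contract could look roughly like the following; the foo_* identifiers are hypothetical, while the netif_rx_mode_* helpers, the NETIF_RX_MODE_* bits and ndo_write_rx_mode are the ones introduced here:

static void foo_set_rx_mode(struct net_device *dev)
{
	struct foo_priv *priv = netdev_priv(dev);

	/* Called under netif_addr_lock_bh(), so it must not sleep.
	 * Only customize the snapshot here, e.g. skip the deferred
	 * write while rx_mode updates are administratively disabled
	 * (priv->rx_mode_enabled is a made-up driver field).
	 */
	netif_rx_mode_set_flag(dev, NETIF_RX_MODE_SET_SKIP,
			       !priv->rx_mode_enabled);
}

static void foo_write_rx_mode(struct net_device *dev)
{
	char *ha_addr;
	int idx;

	/* Runs later from the core's work item with RTNL held, where
	 * blocking device I/O is possible.  foo_hw_*() stand in for the
	 * driver's real device commands.
	 */
	foo_hw_set_promisc(dev,
			   netif_rx_mode_get_cfg(dev, NETIF_RX_MODE_CFG_PROMISC));
	foo_hw_set_allmulti(dev,
			    netif_rx_mode_get_cfg(dev, NETIF_RX_MODE_CFG_ALLMULTI));

	netif_rx_mode_for_each_uc_addr(dev, ha_addr, idx)
		foo_hw_add_uc_filter(dev, ha_addr);
}

static const struct net_device_ops foo_netdev_ops = {
	/* other callbacks omitted */
	.ndo_set_rx_mode   = foo_set_rx_mode,
	.ndo_write_rx_mode = foo_write_rx_mode,
};

The core takes the uc/mc snapshot itself under the address-list lock; the driver only customizes it in ndo_set_rx_mode and reads it back in the deferred ndo_write_rx_mode.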
In this new model, when _dev_set_rx_mode is called, we take a snapshot of the current rx_config and then commit it to the hardware later via a work item To accomplish this, reinterpret set_rx_mode as the ndo for customizing the snapshot and enabling/disabling rx_mode set and add a new ndo write_rx_mode for the deferred I/O Suggested-by: Jakub Kicinski Signed-off-by: I Viswanath --- include/linux/netdevice.h | 111 +++++++++++++++- net/core/dev.c | 264 +++++++++++++++++++++++++++++++++++++- 2 files changed, 368 insertions(+), 7 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 5870a9e514a5..210f320d404d 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1062,6 +1062,44 @@ struct netdev_net_notifier { struct notifier_block *nb; }; =20 +struct netif_cleanup_work { + struct work_struct work; + struct net_device *dev; +}; + +enum netif_rx_mode_cfg { + NETIF_RX_MODE_CFG_ALLMULTI, + NETIF_RX_MODE_CFG_PROMISC, + NETIF_RX_MODE_CFG_VLAN +}; + +enum netif_rx_mode_flags { + NETIF_RX_MODE_READY, + + /* if set, rx_mode set work will be skipped */ + NETIF_RX_MODE_SET_SKIP, + + /* if set, uc/mc lists will not be part of rx_mode config */ + NETIF_RX_MODE_UC_SKIP, + NETIF_RX_MODE_MC_SKIP +}; + +struct netif_rx_mode_config { + char *uc_addrs; + char *mc_addrs; + int uc_count; + int mc_count; + int cfg; +}; + +struct netif_rx_mode_ctx { + struct netif_rx_mode_config *pending; + struct netif_rx_mode_config *ready; + struct work_struct work; + struct net_device *dev; + int flags; +}; + /* * This structure defines the management hooks for network devices. * The following hooks can be defined; unless noted otherwise, they are @@ -1114,9 +1152,14 @@ struct netdev_net_notifier { * changes to configuration when multicast or promiscuous is enabled. * * void (*ndo_set_rx_mode)(struct net_device *dev); - * This function is called device changes address list filtering. + * This function is called when device changes address list filtering. * If driver handles unicast address filtering, it should set - * IFF_UNICAST_FLT in its priv_flags. + * IFF_UNICAST_FLT in its priv_flags. This is used to configure + * the rx_mode snapshot that will be written to the hardware. + * + * void (*ndo_write_rx_mode)(struct net_device *dev); + * This function is scheduled after set_rx_mode and is responsible for + * writing the rx_mode snapshot to the hardware. * * int (*ndo_set_mac_address)(struct net_device *dev, void *addr); * This function is called when the Media Access Control address @@ -1437,6 +1480,7 @@ struct net_device_ops { void (*ndo_change_rx_flags)(struct net_device *dev, int flags); void (*ndo_set_rx_mode)(struct net_device *dev); + void (*ndo_write_rx_mode)(struct net_device *dev); int (*ndo_set_mac_address)(struct net_device *dev, void *addr); int (*ndo_validate_addr)(struct net_device *dev); @@ -1939,7 +1983,7 @@ enum netdev_reg_state { * @ingress_queue: XXX: need comments on this one * @nf_hooks_ingress: netfilter hooks executed for ingress packets * @broadcast: hw bcast address - * + * @rx_mode_ctx: rx_mode work context * @rx_cpu_rmap: CPU reverse-mapping for RX completion interrupts, * indexed by RX queue number. Assigned by driver. * This must only be set if the ndo_rx_flow_steer @@ -1971,6 +2015,8 @@ enum netdev_reg_state { * @link_watch_list: XXX: need comments on this one * * @reg_state: Register/unregister state machine + * @needs_cleanup_work: Should dev_close schedule the cleanup work? 
+ * @cleanup_work: Cleanup work context * @dismantle: Device is going to be freed * @needs_free_netdev: Should unregister perform free_netdev? * @priv_destructor: Called from unregister @@ -2350,6 +2396,7 @@ struct net_device { #endif =20 unsigned char broadcast[MAX_ADDR_LEN]; + struct netif_rx_mode_ctx *rx_mode_ctx; #ifdef CONFIG_RFS_ACCEL struct cpu_rmap *rx_cpu_rmap; #endif @@ -2387,6 +2434,10 @@ struct net_device { =20 u8 reg_state; =20 + bool needs_cleanup_work; + + struct netif_cleanup_work *cleanup_work; + bool dismantle; =20 /** @moving_ns: device is changing netns, protected by @lock */ @@ -3373,6 +3424,60 @@ int dev_loopback_xmit(struct net *net, struct sock *= sk, struct sk_buff *newskb); u16 dev_pick_tx_zero(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev); =20 +/* Helpers to be used in the set_rx_mode implementation */ +static inline void netif_rx_mode_set_cfg(struct net_device *dev, int b, + bool val) +{ + if (val) + dev->rx_mode_ctx->pending->cfg |=3D BIT(b); + else + dev->rx_mode_ctx->pending->cfg &=3D ~BIT(b); +} + +static inline void netif_rx_mode_set_flag(struct net_device *dev, int b, + bool val) +{ + if (val) + dev->rx_mode_ctx->flags |=3D BIT(b); + else + dev->rx_mode_ctx->flags &=3D ~BIT(b); +} + +/* Helpers to be used in the write_rx_mode implementation */ +static inline int netif_rx_mode_get_cfg(struct net_device *dev, int b) +{ + return !!(dev->rx_mode_ctx->ready->cfg & BIT(b)); +} + +static inline int netif_rx_mode_get_flag(struct net_device *dev, int b) +{ + return !!(dev->rx_mode_ctx->flags & BIT(b)); +} + +static inline int netif_rx_mode_get_mc_count(struct net_device *dev) +{ + return dev->rx_mode_ctx->ready->mc_count; +} + +static inline int netif_rx_mode_get_uc_count(struct net_device *dev) +{ + return dev->rx_mode_ctx->ready->uc_count; +} + +void netif_schedule_rx_mode_work(struct net_device *dev); + +void netif_flush_rx_mode_work(struct net_device *dev); + +#define netif_rx_mode_for_each_uc_addr(dev, ha_addr, idx) \ + for (ha_addr =3D (dev)->rx_mode_ctx->ready->uc_addrs, idx =3D 0; \ + idx < netif_rx_mode_get_uc_count((dev)); \ + ha_addr +=3D (dev)->addr_len, idx++) + +#define netif_rx_mode_for_each_mc_addr(dev, ha_addr, idx) \ + for (ha_addr =3D (dev)->rx_mode_ctx->ready->mc_addrs, idx =3D 0; \ + idx < netif_rx_mode_get_mc_count((dev)); \ + ha_addr +=3D (dev)->addr_len, idx++) + int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev); int __dev_direct_xmit(struct sk_buff *skb, u16 queue_id); =20 diff --git a/net/core/dev.c b/net/core/dev.c index 36dc5199037e..ffa0615b688e 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1587,6 +1587,206 @@ void netif_state_change(struct net_device *dev) } } =20 +/* This function attempts to copy the current state of the + * net device into pending (reallocating if necessary). If it fails, + * pending is guaranteed to be unmodified. 
+ */ +static int __netif_prepare_rx_mode(struct net_device *dev) +{ + struct netif_rx_mode_config *pending =3D dev->rx_mode_ctx->pending; + bool skip_uc =3D false, skip_mc =3D false; + int uc_count =3D 0, mc_count =3D 0; + struct netdev_hw_addr *ha; + char *tmp; + int i; + + skip_uc =3D netif_rx_mode_get_flag(dev, NETIF_RX_MODE_UC_SKIP); + skip_mc =3D netif_rx_mode_get_flag(dev, NETIF_RX_MODE_MC_SKIP); + + /* The allocations need to be atomic since this will be called under + * netif_addr_lock_bh() + */ + if (!skip_uc) { + uc_count =3D netdev_uc_count(dev); + tmp =3D krealloc(pending->uc_addrs, uc_count * dev->addr_len, + GFP_ATOMIC); + if (!tmp) + return -ENOMEM; + pending->uc_addrs =3D tmp; + } + + if (!skip_mc) { + mc_count =3D netdev_mc_count(dev); + tmp =3D krealloc(pending->mc_addrs, mc_count * dev->addr_len, + GFP_ATOMIC); + if (!tmp) + return -ENOMEM; + pending->mc_addrs =3D tmp; + } + + /* This function cannot fail after this point */ + + /* This is going to be the same for every single driver. Better to + * do it here than in the set_rx_mode impl + */ + netif_rx_mode_set_cfg(dev, NETIF_RX_MODE_CFG_ALLMULTI, + !!(dev->flags & IFF_ALLMULTI)); + + netif_rx_mode_set_cfg(dev, NETIF_RX_MODE_CFG_PROMISC, + !!(dev->flags & IFF_PROMISC)); + + i =3D 0; + if (!skip_uc) { + pending->uc_count =3D uc_count; + netdev_for_each_uc_addr(ha, dev) + memcpy(pending->uc_addrs + (i++) * dev->addr_len, + ha->addr, dev->addr_len); + } + + i =3D 0; + if (!skip_mc) { + pending->mc_count =3D mc_count; + netdev_for_each_mc_addr(ha, dev) + memcpy(pending->mc_addrs + (i++) * dev->addr_len, + ha->addr, dev->addr_len); + } + return 0; +} + +static void netif_prepare_rx_mode(struct net_device *dev) +{ + lockdep_assert_held(&dev->addr_list_lock); + int rc; + + rc =3D __netif_prepare_rx_mode(dev); + if (rc) + return; + + netif_rx_mode_set_flag(dev, NETIF_RX_MODE_READY, true); +} + +static void netif_write_rx_mode(struct work_struct *param) +{ + struct netif_rx_mode_ctx *ctx; + struct net_device *dev; + + rtnl_lock(); + ctx =3D container_of(param, struct netif_rx_mode_ctx, work); + dev =3D ctx->dev; + + if (!netif_running(dev)) { + rtnl_unlock(); + return; + } + + /* Paranoia. 
*/ + if (WARN_ON(!dev->netdev_ops->ndo_write_rx_mode)) { + rtnl_unlock(); + return; + } + + /* We could introduce a new lock for this but reusing the addr + * lock works well enough + */ + netif_addr_lock_bh(dev); + + /* There's no point continuing if the pending config is not ready */ + if (!netif_rx_mode_get_flag(dev, NETIF_RX_MODE_READY)) { + netif_addr_unlock_bh(dev); + rtnl_unlock(); + return; + } + + swap(ctx->ready, ctx->pending); + netif_rx_mode_set_flag(dev, NETIF_RX_MODE_READY, false); + netif_addr_unlock_bh(dev); + + dev->netdev_ops->ndo_write_rx_mode(dev); + rtnl_unlock(); +} + +static int netif_alloc_rx_mode_ctx(struct net_device *dev) +{ + dev->rx_mode_ctx =3D kzalloc(sizeof(*dev->rx_mode_ctx), GFP_KERNEL); + if (!dev->rx_mode_ctx) + goto fail_all; + + dev->rx_mode_ctx->ready =3D kzalloc(sizeof(*dev->rx_mode_ctx->ready), + GFP_KERNEL); + if (!dev->rx_mode_ctx->ready) + goto fail_ready; + + dev->rx_mode_ctx->pending =3D kzalloc(sizeof(*dev->rx_mode_ctx->pending), + GFP_KERNEL); + if (!dev->rx_mode_ctx->pending) + goto fail_pending; + + dev->rx_mode_ctx->dev =3D dev; + INIT_WORK(&dev->rx_mode_ctx->work, netif_write_rx_mode); + return 0; + +fail_pending: + kfree(dev->rx_mode_ctx->ready); + +fail_ready: + kfree(dev->rx_mode_ctx); + +fail_all: + return -ENOMEM; +} + +static void netif_free_rx_mode_ctx(struct net_device *dev) +{ + if (!dev->rx_mode_ctx) + return; + + cancel_work_sync(&dev->rx_mode_ctx->work); + + kfree(dev->rx_mode_ctx->ready->uc_addrs); + kfree(dev->rx_mode_ctx->ready->mc_addrs); + kfree(dev->rx_mode_ctx->ready); + + kfree(dev->rx_mode_ctx->pending->uc_addrs); + kfree(dev->rx_mode_ctx->pending->mc_addrs); + kfree(dev->rx_mode_ctx->pending); + + kfree(dev->rx_mode_ctx); + dev->rx_mode_ctx =3D NULL; +} + +static void netif_cleanup_work_fn(struct work_struct *param) +{ + struct netif_cleanup_work *ctx; + struct net_device *dev; + + ctx =3D container_of(param, struct netif_cleanup_work, work); + dev =3D ctx->dev; + + if (dev->netdev_ops->ndo_write_rx_mode) + netif_free_rx_mode_ctx(dev); +} + +static int netif_alloc_cleanup_work(struct net_device *dev) +{ + dev->cleanup_work =3D kzalloc(sizeof(*dev->cleanup_work), GFP_KERNEL); + if (!dev->cleanup_work) + return -ENOMEM; + + dev->cleanup_work->dev =3D dev; + INIT_WORK(&dev->cleanup_work->work, netif_cleanup_work_fn); + return 0; +} + +static void netif_free_cleanup_work(struct net_device *dev) +{ + if (!dev->cleanup_work) + return; + + cancel_work_sync(&dev->cleanup_work->work); + kfree(dev->cleanup_work); + dev->cleanup_work =3D NULL; +} + /** * __netdev_notify_peers - notify network peers about existence of @dev, * to be called when rtnl lock is already held. 
@@ -1682,6 +1882,16 @@ static int __dev_open(struct net_device *dev, struct= netlink_ext_ack *extack) if (!ret && ops->ndo_open) ret =3D ops->ndo_open(dev); =20 + if (!ret && dev->needs_cleanup_work) { + if (!dev->cleanup_work) + ret =3D netif_alloc_cleanup_work(dev); + else + cancel_work_sync(&dev->cleanup_work->work); + } + + if (!ret && ops->ndo_write_rx_mode) + ret =3D netif_alloc_rx_mode_ctx(dev); + netpoll_poll_enable(dev); =20 if (ret) @@ -1755,6 +1965,9 @@ static void __dev_close_many(struct list_head *head) if (ops->ndo_stop) ops->ndo_stop(dev); =20 + if (dev->needs_cleanup_work) + schedule_work(&dev->cleanup_work->work); + netif_set_up(dev, false); netpoll_poll_enable(dev); } @@ -9623,6 +9836,47 @@ int netif_set_allmulti(struct net_device *dev, int i= nc, bool notify) return 0; } =20 +/* netif_schedule_rx_mode_work - Sets up the rx_config snapshot and + * schedules the deferred I/O. + */ +void netif_schedule_rx_mode_work(struct net_device *dev) +{ + const struct net_device_ops *ops =3D dev->netdev_ops; + + if (ops->ndo_set_rx_mode) + ops->ndo_set_rx_mode(dev); + + if (!ops->ndo_write_rx_mode) + return; + + /* This part is only for drivers that implement ndo_write_rx_mode */ + + /* If rx_mode set is to be skipped, we don't schedule the work */ + if (netif_rx_mode_get_flag(dev, NETIF_RX_MODE_SET_SKIP)) + return; + + netif_prepare_rx_mode(dev); + schedule_work(&dev->rx_mode_ctx->work); +} +EXPORT_SYMBOL(netif_schedule_rx_mode_work); + +/* Drivers that implement rx mode as work flush the work item when closing + * or suspending. This is the substitute for those calls. + */ +void netif_flush_rx_mode_work(struct net_device *dev) +{ + /* Calling this function with RTNL held will result in a deadlock. */ + if (WARN_ON(rtnl_is_locked())) + return; + + /* Doing nothing is enough to "flush" work on a closed interface */ + if (!netif_running(dev)) + return; + + flush_work(&dev->rx_mode_ctx->work); +} +EXPORT_SYMBOL(netif_flush_rx_mode_work); + /* * Upload unicast and multicast address lists to device and * configure RX filtering. When the device doesn't support unicast @@ -9631,8 +9885,6 @@ int netif_set_allmulti(struct net_device *dev, int in= c, bool notify) */ void __dev_set_rx_mode(struct net_device *dev) { - const struct net_device_ops *ops =3D dev->netdev_ops; - /* dev_open will call this function so the list will stay sane. 
*/ if (!(dev->flags&IFF_UP)) return; @@ -9653,8 +9905,7 @@ void __dev_set_rx_mode(struct net_device *dev) } } =20 - if (ops->ndo_set_rx_mode) - ops->ndo_set_rx_mode(dev); + netif_schedule_rx_mode_work(dev); } =20 void dev_set_rx_mode(struct net_device *dev) @@ -11325,6 +11576,9 @@ int register_netdevice(struct net_device *dev) } } =20 + if (dev->netdev_ops->ndo_write_rx_mode) + dev->needs_cleanup_work =3D true; + if (((dev->hw_features | dev->features) & NETIF_F_HW_VLAN_CTAG_FILTER) && (!dev->netdev_ops->ndo_vlan_rx_add_vid || @@ -12068,6 +12322,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv= , const char *name, dev->real_num_rx_queues =3D rxqs; if (netif_alloc_rx_queues(dev)) goto free_all; + dev->ethtool =3D kzalloc(sizeof(*dev->ethtool), GFP_KERNEL_ACCOUNT); if (!dev->ethtool) goto free_all; @@ -12151,6 +12406,7 @@ void free_netdev(struct net_device *dev) kfree(dev->ethtool); netif_free_tx_queues(dev); netif_free_rx_queues(dev); + netif_free_cleanup_work(dev); =20 kfree(rcu_dereference_protected(dev->ingress_queue, 1)); =20 --=20 2.47.3
From nobody Sun Feb 8 10:44:13 2026
From: I Viswanath
To: edumazet@google.com, andrew+netdev@lunn.ch, horms@kernel.org, kuba@kernel.org, pabeni@redhat.com, mst@redhat.com, eperezma@redhat.com, jasowang@redhat.com, xuanzhuo@linux.alibaba.com
Cc: netdev@vger.kernel.org, virtualization@lists.linux.dev, linux-kernel@vger.kernel.org, I Viswanath
Subject: [PATCH net-next v7 2/2] virtio-net: Implement ndo_write_rx_mode callback
Date: Fri, 2 Jan 2026 23:35:30 +0530
Message-ID: <20260102180530.1559514-3-viswanathiyyappan@gmail.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260102180530.1559514-1-viswanathiyyappan@gmail.com>
References: <20260102180530.1559514-1-viswanathiyyappan@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Implement the ndo_write_rx_mode callback for virtio-net, replacing the driver's private rx_mode work item with the core's deferred rx_mode mechanism. Signed-off-by: I Viswanath --- drivers/net/virtio_net.c | 55 +++++++++++++++------------------------- 1 file changed, 21 insertions(+), 34 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 1bb3aeca66c6..83d543bf6ae2 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -460,9 +460,6 @@ struct virtnet_info { /* Work struct for config space updates */ struct work_struct config_work; =20 - /* Work struct for setting rx mode */ - struct work_struct rx_mode_work; - /* OK to queue work setting RX mode?
*/ bool rx_mode_work_enabled; =20 @@ -3866,33 +3863,31 @@ static int virtnet_close(struct net_device *dev) return 0; } =20 -static void virtnet_rx_mode_work(struct work_struct *work) +static void virtnet_write_rx_mode(struct net_device *dev) { - struct virtnet_info *vi =3D - container_of(work, struct virtnet_info, rx_mode_work); + struct virtnet_info *vi =3D netdev_priv(dev); u8 *promisc_allmulti __free(kfree) =3D NULL; - struct net_device *dev =3D vi->dev; struct scatterlist sg[2]; struct virtio_net_ctrl_mac *mac_data; - struct netdev_hw_addr *ha; + char *ha_addr; int uc_count; int mc_count; void *buf; + int idx; int i; =20 /* We can't dynamically set ndo_set_rx_mode, so return gracefully */ if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX)) return; =20 - promisc_allmulti =3D kzalloc(sizeof(*promisc_allmulti), GFP_KERNEL); + promisc_allmulti =3D kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC); if (!promisc_allmulti) { dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n"); return; } =20 - rtnl_lock(); - - *promisc_allmulti =3D !!(dev->flags & IFF_PROMISC); + *promisc_allmulti =3D netif_rx_mode_get_cfg(dev, + NETIF_RX_MODE_CFG_PROMISC); sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); =20 if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, @@ -3900,7 +3895,8 @@ static void virtnet_rx_mode_work(struct work_struct *= work) dev_warn(&dev->dev, "Failed to %sable promisc mode.\n", *promisc_allmulti ? "en" : "dis"); =20 - *promisc_allmulti =3D !!(dev->flags & IFF_ALLMULTI); + *promisc_allmulti =3D netif_rx_mode_get_cfg(dev, + NETIF_RX_MODE_CFG_ALLMULTI); sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); =20 if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, @@ -3908,27 +3904,22 @@ static void virtnet_rx_mode_work(struct work_struct= *work) dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n", *promisc_allmulti ? 
"en" : "dis"); =20 - netif_addr_lock_bh(dev); - - uc_count =3D netdev_uc_count(dev); - mc_count =3D netdev_mc_count(dev); + uc_count =3D netif_rx_mode_get_uc_count(dev); + mc_count =3D netif_rx_mode_get_mc_count(dev); /* MAC filter - use one buffer for both lists */ buf =3D kzalloc(((uc_count + mc_count) * ETH_ALEN) + (2 * sizeof(mac_data->entries)), GFP_ATOMIC); mac_data =3D buf; - if (!buf) { - netif_addr_unlock_bh(dev); - rtnl_unlock(); + if (!buf) return; - } =20 sg_init_table(sg, 2); =20 /* Store the unicast list and count in the front of the buffer */ mac_data->entries =3D cpu_to_virtio32(vi->vdev, uc_count); i =3D 0; - netdev_for_each_uc_addr(ha, dev) - memcpy(&mac_data->macs[i++][0], ha->addr, ETH_ALEN); + netif_rx_mode_for_each_uc_addr(dev, ha_addr, idx) + memcpy(&mac_data->macs[i++][0], ha_addr, ETH_ALEN); =20 sg_set_buf(&sg[0], mac_data, sizeof(mac_data->entries) + (uc_count * ETH_ALEN)); @@ -3938,10 +3929,8 @@ static void virtnet_rx_mode_work(struct work_struct = *work) =20 mac_data->entries =3D cpu_to_virtio32(vi->vdev, mc_count); i =3D 0; - netdev_for_each_mc_addr(ha, dev) - memcpy(&mac_data->macs[i++][0], ha->addr, ETH_ALEN); - - netif_addr_unlock_bh(dev); + netif_rx_mode_for_each_mc_addr(dev, ha_addr, idx) + memcpy(&mac_data->macs[i++][0], ha_addr, ETH_ALEN); =20 sg_set_buf(&sg[1], mac_data, sizeof(mac_data->entries) + (mc_count * ETH_ALEN)); @@ -3950,17 +3939,15 @@ static void virtnet_rx_mode_work(struct work_struct= *work) VIRTIO_NET_CTRL_MAC_TABLE_SET, sg)) dev_warn(&dev->dev, "Failed to set MAC filter table.\n"); =20 - rtnl_unlock(); - kfree(buf); } =20 static void virtnet_set_rx_mode(struct net_device *dev) { struct virtnet_info *vi =3D netdev_priv(dev); + char cfg_disabled =3D !vi->rx_mode_work_enabled; =20 - if (vi->rx_mode_work_enabled) - schedule_work(&vi->rx_mode_work); + netif_rx_mode_set_flag(dev, NETIF_RX_MODE_SET_SKIP, cfg_disabled); } =20 static int virtnet_vlan_rx_add_vid(struct net_device *dev, @@ -5776,7 +5763,7 @@ static void virtnet_freeze_down(struct virtio_device = *vdev) /* Make sure no work handler is accessing the device */ flush_work(&vi->config_work); disable_rx_mode_work(vi); - flush_work(&vi->rx_mode_work); + netif_flush_rx_mode_work(vi->dev); =20 if (netif_running(vi->dev)) { rtnl_lock(); @@ -6279,6 +6266,7 @@ static const struct net_device_ops virtnet_netdev =3D= { .ndo_validate_addr =3D eth_validate_addr, .ndo_set_mac_address =3D virtnet_set_mac_address, .ndo_set_rx_mode =3D virtnet_set_rx_mode, + .ndo_write_rx_mode =3D virtnet_write_rx_mode, .ndo_get_stats64 =3D virtnet_stats, .ndo_vlan_rx_add_vid =3D virtnet_vlan_rx_add_vid, .ndo_vlan_rx_kill_vid =3D virtnet_vlan_rx_kill_vid, @@ -6900,7 +6888,6 @@ static int virtnet_probe(struct virtio_device *vdev) vdev->priv =3D vi; =20 INIT_WORK(&vi->config_work, virtnet_config_changed_work); - INIT_WORK(&vi->rx_mode_work, virtnet_rx_mode_work); spin_lock_init(&vi->refill_lock); =20 if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) { @@ -7205,7 +7192,7 @@ static void virtnet_remove(struct virtio_device *vdev) /* Make sure no work handler is accessing the device. */ flush_work(&vi->config_work); disable_rx_mode_work(vi); - flush_work(&vi->rx_mode_work); + netif_flush_rx_mode_work(vi->dev); =20 virtnet_free_irq_moder(vi); =20 --=20 2.47.3