[RFT net-next v4 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write

I Viswanath posted 2 patches 1 week, 6 days ago
drivers/net/virtio_net.c  |  56 +++++------
include/linux/netdevice.h | 104 ++++++++++++++++++-
net/core/dev.c            | 207 +++++++++++++++++++++++++++++++++++++-
3 files changed, 328 insertions(+), 39 deletions(-)
[RFT net-next v4 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write
Posted by I Viswanath 1 week, 6 days ago
This is an implementation of the idea provided by Jakub here

https://lore.kernel.org/netdev/20250923163727.5e97abdb@kernel.org/

ndo_set_rx_mode is problematic because it cannot sleep. 

To address this, this series proposes dividing the concept of setting
rx_mode into 2 stages: snapshot and deferred I/O. To achieve this, we
reinterpret set_rx_mode and add create a new ndo write_rx_mode as
explained below:

The new set_rx_mode will be responsible for customizing the rx_mode
snapshot which will be used by write_rx_mode to update the hardware

In brief, the new flow looks something like:

prepare_rx_mode():
    ndo_set_rx_mode();
    ready_snapshot();

write_rx_mode():
    commit_and_use_snapshot();
    ndo_write_rx_mode();

write_rx_mode() is called from a work item and doesn't hold the 
netif_addr_lock lock during ndo_write_rx_mode() making it sleepable
in that section.

This model should work correctly if the following conditions hold:

1. write_rx_mode should use the rx_mode set by the most recent
    call to prepare_rx_mode before its execution.

2. If a prepare_rx_mode call happens during execution of write_rx_mode,
    write_rx_mode should be rescheduled.

3. All calls to modify rx_mode should pass through the prepare_rx_mode +
    schedule write_rx_mode execution flow. netif_rx_mode_schedule_work 
    has been implemented in core for this.

1 and 2 are guaranteed because of the properties of work queues

Drivers need to ensure 3

To use this model, a driver needs to implement the
ndo_write_rx_mode callback, change the set_rx_mode callback
appropriately and replace all calls to modify rx mode with
netif_rx_mode_schedule_work

---

Questions I have:

1) Would there ever be a situation in which you will have to wait for 
I/O to complete in a call to set_rx_mode before proceeding further? 
That is, Does netif_rx_mode_schedule_work need the flush argument?

2) Does priv_ptr in netif_rx_mode_config make sense? For virtio_net, 
I can get the vi pointer by doing netdev_priv(dev) and 
am wondering if this would be a common thing

3) From a previous discussion: 
https://lore.kernel.org/netdev/417c677f-268a-4163-b07e-deea8f9b9b40@intel.com/

On Thu, 23 Oct 2025 at 05:16, Jacob Keller  wrote:
> Is there any mechanism to make this guarantee either implemented or at
> least verified by the core? If not that, what about some sort of way to
> lint driver code and make sure its correct?

I am not sure how to automate this but basically we need warnings to be generated
when the the set_rx_mode implementations are called normally in code (From my 
understanding, usually in the open callback or the timeout function) but not when 
they are called through ops->set_rx_mode. Can Coccinelle do something like this?

v1:
Link: https://lore.kernel.org/netdev/20251020134857.5820-1-viswanathiyyappan@gmail.com/

v2:
- Exported set_and_schedule_rx_config as a symbol for use in modules
- Fixed incorrect cleanup for the case of rx_work alloc failing in alloc_netdev_mqs
- Removed the locked version (cp_set_rx_mode) and renamed __cp_set_rx_mode to cp_set_rx_mode
Link: https://lore.kernel.org/netdev/20251026175445.1519537-1-viswanathiyyappan@gmail.com/

v3:
- Added RFT tag
- Corrected mangled patch
Link: https://lore.kernel.org/netdev/20251028174222.1739954-1-viswanathiyyappan@gmail.com/

v4:
- Completely reworked the snapshot mechanism as per v3 comments
- Implemented the callback for virtio-net instead of 8139cp driver
- Removed RFC tag

I Viswanath (2):
  net: refactor set_rx_mode into snapshot and deferred I/O
  virtio-net: Implement ndO_write_rx_mode callback

 drivers/net/virtio_net.c  |  56 +++++------
 include/linux/netdevice.h | 104 ++++++++++++++++++-
 net/core/dev.c            | 207 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 328 insertions(+), 39 deletions(-)

-- 
2.34.1
Re: [RFT net-next v4 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write
Posted by I Viswanath 1 week, 6 days ago
Hi,

I am sorry that it is broken. I will submit a v5 as soon as possible
Re: [RFT net-next v4 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write
Posted by Jakub Kicinski 1 week, 6 days ago
On Tue, 18 Nov 2025 22:26:42 +0530 I Viswanath wrote:
> I am sorry that it is broken. I will submit a v5 as soon as possible

Broken as in patch titles or broken as in it crashes the kernel?
Both are true. Just to be safe "as soon as possible" hopefully
takes into account our "no reposts within 24h" rule.
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html

Also I'm not sure what you mean by RFT, doubt anyone will test this 
for you, and you're modifying virtio which you should be able to test
yourself.. RFC or PATCH is the choice.
Re: [RFT net-next v4 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write
Posted by I Viswanath 1 week, 5 days ago
On Wed, 19 Nov 2025 at 00:20, Jakub Kicinski <kuba@kernel.org> wrote:
>
> Broken as in patch titles or broken as in it crashes the kernel?
> Both are true.
Yeah, I will fix both

> Just to be safe "as soon as possible" hopefully
> takes into account our "no reposts within 24h" rule.
> https://www.kernel.org/doc/html/next/process/maintainer-netdev.html
Yes, It does

> Also I'm not sure what you mean by RFT, doubt anyone will test this
> for you, and you're modifying virtio which you should be able to test
> yourself.. RFC or PATCH is the choice.

Just to be clear, would testing packet flow with all the possible mode
combinations
under heavy traffic be sufficient and exhaustive? I think this should
be PATCH once
I sort everything out
Re: [RFT net-next v4 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write
Posted by Jakub Kicinski 1 week, 4 days ago
On Wed, 19 Nov 2025 09:27:58 +0530 I Viswanath wrote:
> > Also I'm not sure what you mean by RFT, doubt anyone will test this
> > for you, and you're modifying virtio which you should be able to test
> > yourself.. RFC or PATCH is the choice.  
> 
> Just to be clear, would testing packet flow with all the possible mode
> combinations
> under heavy traffic be sufficient and exhaustive? I think this should
> be PATCH once I sort everything out

I don't think traffic matters all that much, we're talking about control
path change. It's more important to exercise all the address lists 
(l2 unicast, multicast) and rx modes (promisc, allmulti) etc.
Then whatever other paths may load to touching the relevant state in
the driver (eg virtnet_freeze_down(), not triggers that, suspend?,
migration?).

In terms of traffic simple ping or a few UDP packets would be enough,
mostly to confirm the filtering is working as expected.