[PATCH v3] virtio-net: Fix VLAN filter table reset timing

Akihiko Odaki posted 1 patch 4 months, 3 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20250727-vlan-v3-1-bbee738619b1@rsg.ci.i.u-tokyo.ac.jp
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>
[PATCH v3] virtio-net: Fix VLAN filter table reset timing
Posted by Akihiko Odaki 4 months, 3 weeks ago
Problem
-------

The expected initial state of the table depends on feature negotiation:

With VIRTIO_NET_F_CTRL_VLAN:
  The table must be empty in accordance with the specification.
Without VIRTIO_NET_F_CTRL_VLAN:
  The table must be filled to permit all VLAN traffic.

Prior to commit 06b636a1e2ad ("virtio-net: do not reset vlan filtering
at set_features"), virtio_net_set_features() always reset the VLAN
table. That commit changed the behavior to skip table reset when
VIRTIO_NET_F_CTRL_VLAN was negotiated, assuming the table would be
properly cleared during device reset and remain stable.

However, this assumption breaks when a driver renegotiates features:
1. Initial negotiation without VIRTIO_NET_F_CTRL_VLAN (table filled)
2. Renegotiation with VIRTIO_NET_F_CTRL_VLAN (table will not be cleared)
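
The two steps above can be replayed against a small stand-alone model
of the pre-patch logic. This is only a sketch: struct vlan_model,
old_reset(), old_set_features() and stale_after_renegotiation() are
hypothetical names, and the byte-wise table is a simplification of the
real device state. Bit 19 is the VIRTIO_NET_F_CTRL_VLAN feature bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_VLAN (1 << 12)   /* one bit per 12-bit VLAN ID */
#define CTRL_VLAN_BIT 19     /* VIRTIO_NET_F_CTRL_VLAN feature bit */

/* Hypothetical stand-in for the VLAN part of the device state. */
struct vlan_model {
    uint8_t vlans[MAX_VLAN >> 3];
};

static bool has_feature(uint64_t features, unsigned bit)
{
    return (features >> bit) & 1;
}

/* A set bit means frames tagged with that VLAN ID pass the filter. */
static bool vlan_allowed(const struct vlan_model *m, unsigned vid)
{
    return (m->vlans[vid >> 3] >> (vid & 7)) & 1;
}

/* Pre-patch reset: always clears the table. */
static void old_reset(struct vlan_model *m)
{
    memset(m->vlans, 0, sizeof(m->vlans));
}

/* Pre-patch set_features: fills the table only when the feature is absent. */
static void old_set_features(struct vlan_model *m, uint64_t features)
{
    if (!has_feature(features, CTRL_VLAN_BIT)) {
        memset(m->vlans, 0xff, sizeof(m->vlans));
    }
}

/*
 * Replays both negotiation steps and reports whether VLAN `vid`
 * (which must be below MAX_VLAN) still passes the filter afterwards.
 */
static bool stale_after_renegotiation(unsigned vid)
{
    struct vlan_model m;

    old_reset(&m);
    old_set_features(&m, 0);                      /* step 1: table filled */
    assert(vlan_allowed(&m, vid));
    old_set_features(&m, 1ULL << CTRL_VLAN_BIT);  /* step 2: table untouched */
    return vlan_allowed(&m, vid);
}
```

In this model stale_after_renegotiation(100) returns true: VLAN 100
still passes although the driver negotiated VIRTIO_NET_F_CTRL_VLAN and
never added it to the table.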

The problem was exacerbated by commit 0caed25cd171 ("virtio: Call
set_features during reset"), which triggered virtio_net_set_features()
during device reset, exposing the bug whenever VIRTIO_NET_F_CTRL_VLAN
was negotiated after a device reset.

Solution
--------

Fix the issue by initializing the table when virtio_net_set_features()
is called to change the VIRTIO_NET_F_CTRL_VLAN bit of
vdev->guest_features.

This approach ensures the correct table state regardless of feature
negotiation sequence by performing initialization in
virtio_net_set_features() as QEMU did prior to commit 06b636a1e2ad
("virtio-net: do not reset vlan filtering at set_features").

This change still preserves the goal of the commit, which was to avoid
resetting the table during migration, by checking whether the
VIRTIO_NET_F_CTRL_VLAN bit of vdev->guest_features is being changed;
vdev->guest_features is set before virtio_net_set_features() gets called
during migration.

It also avoids resetting the table when the driver sets a feature
bitmask with no change for the VIRTIO_NET_F_CTRL_VLAN bit, which makes
the operation idempotent and its semantics cleaner.
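
As a rough illustration of the check described above, the following
stand-alone sketch models the XOR test on the old and new feature
masks. struct vdev_model and the model_*() helpers are hypothetical;
updating guest_features at the end of model_set_features() is an
assumption standing in for what the virtio core does after the
callback returns, and the byte-wise table is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_VLAN (1 << 12)   /* one bit per 12-bit VLAN ID */
#define CTRL_VLAN_BIT 19     /* VIRTIO_NET_F_CTRL_VLAN feature bit */

/* Hypothetical stand-in for the relevant device state. */
struct vdev_model {
    uint64_t guest_features;        /* mirrors vdev->guest_features */
    uint8_t vlans[MAX_VLAN >> 3];
};

static bool has_feature(uint64_t features, unsigned bit)
{
    return (features >> bit) & 1;
}

/* Realize: no features negotiated yet, so permit all VLAN traffic. */
static void model_realize(struct vdev_model *m)
{
    m->guest_features = 0;
    memset(m->vlans, 0xff, sizeof(m->vlans));
}

/* Reinitialize the table only when the CTRL_VLAN bit actually flips. */
static void model_set_features(struct vdev_model *m, uint64_t features)
{
    if (has_feature(m->guest_features ^ features, CTRL_VLAN_BIT)) {
        bool vlan = has_feature(features, CTRL_VLAN_BIT);
        memset(m->vlans, vlan ? 0 : 0xff, sizeof(m->vlans));
    }
    /* Assumption: models the core recording the new mask afterwards. */
    m->guest_features = features;
}

static bool table_filled_with(const struct vdev_model *m, uint8_t byte)
{
    for (size_t i = 0; i < sizeof(m->vlans); i++) {
        if (m->vlans[i] != byte) {
            return false;
        }
    }
    return true;
}

static void demo_patched_logic(void)
{
    const uint64_t ctrl_vlan = 1ULL << CTRL_VLAN_BIT;
    struct vdev_model m;

    model_realize(&m);
    model_set_features(&m, 0);           /* bit unchanged: stays filled */
    assert(table_filled_with(&m, 0xff));
    model_set_features(&m, ctrl_vlan);   /* bit turns on: table cleared */
    assert(table_filled_with(&m, 0));

    m.vlans[100 >> 3] |= (uint8_t)(1u << (100 & 7));  /* add VLAN 100 */
    model_set_features(&m, ctrl_vlan);   /* same mask again: entry kept */
    assert(m.vlans[100 >> 3] == (1u << (100 & 7)));

    model_set_features(&m, 0);           /* bit turns off: table filled */
    assert(table_filled_with(&m, 0xff));
}
```

The third call in demo_patched_logic() corresponds to both the
idempotent case and the migration case: the mask already stored in
guest_features equals the mask passed in, so the table is left alone.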

Additionally, this change ensures the table is initialized after
feature negotiation and before the DRIVER_OK status bit is set, for
compatibility with the Linux driver before commit 50c0ada627f5
("virtio-net: fix race between ndo_open() and virtio_device_ready()"),
which did not ensure that the DRIVER_OK status bit was set before
modifying the table.

Fixes: 06b636a1e2ad ("virtio-net: do not reset vlan filtering at set_features")
Cc: qemu-stable@nongnu.org
Reported-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Tested-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Tested-by: Lei Yang <leiyang@redhat.com>
---
Not tested.

Konstantin, I would also like you to test this new version. Please also
give it a Tested-by and, if possible, a Reviewed-by.
---
Changes in v3:
- Dropped RFC.
- Rebased.
- Link to v2: https://lore.kernel.org/qemu-devel/20250714-vlan-v2-1-2d589ba4dcd3@rsg.ci.i.u-tokyo.ac.jp

Changes in v2:
- Addressed a concern with old drivers that do not properly set
  DRIVER_OK (pointed out by Michael S. Tsirkin).
- Noted that this change does not simply revert commit 06b636a1e2ad
  ("virtio-net: do not reset vlan filtering at set_features") but
  preserves its goal.
- Added Cc: qemu-stable@nongnu.org.
- Link to v1: https://lore.kernel.org/qemu-devel/20250713-vlan-v1-1-a3cf0bcfa644@rsg.ci.i.u-tokyo.ac.jp
---
 hw/net/virtio-net.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index c4c49b0f9caa1e8f26fa2e03acc8786936877eba..6b5b5dace334af12e9b77a8a2765c88443cee235 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -929,8 +929,9 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
         vhost_net_save_acked_features(nc->peer);
     }
 
-    if (!virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
-        memset(n->vlans, 0xff, MAX_VLAN >> 3);
+    if (virtio_has_feature(vdev->guest_features ^ features, VIRTIO_NET_F_CTRL_VLAN)) {
+        bool vlan = virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN);
+        memset(n->vlans, vlan ? 0 : 0xff, MAX_VLAN >> 3);
     }
 
     if (virtio_has_feature(features, VIRTIO_NET_F_STANDBY)) {
@@ -3942,6 +3943,7 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
     n->mac_table.macs = g_malloc0(MAC_TABLE_ENTRIES * ETH_ALEN);
 
     n->vlans = g_malloc0(MAX_VLAN >> 3);
+    memset(n->vlans, 0xff, MAX_VLAN >> 3);
 
     nc = qemu_get_queue(n->nic);
     nc->rxfilter_notify_enabled = 1;
@@ -4041,7 +4043,6 @@ static void virtio_net_reset(VirtIODevice *vdev)
     memset(n->mac_table.macs, 0, MAC_TABLE_ENTRIES * ETH_ALEN);
     memcpy(&n->mac[0], &n->nic->conf->macaddr, sizeof(n->mac));
     qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac);
-    memset(n->vlans, 0, MAX_VLAN >> 3);
 
     /* Flush any async TX */
     for (i = 0;  i < n->max_queue_pairs; i++) {

---
base-commit: 9e601684dc24a521bb1d23215a63e5c6e79ea0bb
change-id: 20250713-vlan-8c107a65ad91

Best regards,
-- 
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Re: [PATCH v3] virtio-net: Fix VLAN filter table reset timing
Posted by Michael Tokarev 4 months, 2 weeks ago
On 27.07.2025 09:22, Akihiko Odaki wrote:
...
> @@ -3942,6 +3943,7 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
>       n->mac_table.macs = g_malloc0(MAC_TABLE_ENTRIES * ETH_ALEN);
>   
>       n->vlans = g_malloc0(MAX_VLAN >> 3);
> +    memset(n->vlans, 0xff, MAX_VLAN >> 3);

A nitpick: we don't need to init this memory with 0 before
initing it with 0xff.

But looking at this, why can't we embed n->vlans directly into
this structure, something like the attached patch?

This, and maybe a few other fields like it?
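
To make the two points above concrete, here is a stand-alone sketch,
not the attached patch. struct nic_ptr, struct nic_embedded and their
helpers are hypothetical names; plain malloc() stands in for the GLib
allocator so the example compiles on its own, and the byte array is a
simplification of the real field.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_VLAN (1 << 12)   /* one bit per 12-bit VLAN ID */

/* Pointer variant: the table is a separate heap allocation. */
struct nic_ptr {
    uint8_t *vlans;
};

static int nic_ptr_init(struct nic_ptr *n)
{
    /* No zeroing allocator needed: every byte is overwritten below. */
    n->vlans = malloc(MAX_VLAN >> 3);
    if (!n->vlans) {
        return -1;
    }
    memset(n->vlans, 0xff, MAX_VLAN >> 3);
    return 0;
}

static void nic_ptr_destroy(struct nic_ptr *n)
{
    free(n->vlans);
    n->vlans = NULL;
}

/* Embedded variant: the table lives inside the device structure. */
struct nic_embedded {
    uint8_t vlans[MAX_VLAN >> 3];
};

static void nic_embedded_init(struct nic_embedded *n)
{
    memset(n->vlans, 0xff, sizeof(n->vlans));
}

/* Both variants end up with the same "permit all" table contents. */
static void demo_vlan_storage(void)
{
    struct nic_ptr p;
    struct nic_embedded e;
    int rc = nic_ptr_init(&p);

    assert(rc == 0);
    nic_embedded_init(&e);
    assert(memcmp(p.vlans, e.vlans, sizeof(e.vlans)) == 0);
    nic_ptr_destroy(&p);
}
```

The embedded variant has no allocation to fail or free, and
sizeof(n->vlans) gives the table size directly.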

Thanks,

/mjt
Re: [PATCH v3] virtio-net: Fix VLAN filter table reset timing
Posted by Akihiko Odaki 4 months, 2 weeks ago
On 2025/08/02 16:26, Michael Tokarev wrote:
> On 27.07.2025 09:22, Akihiko Odaki wrote:
> ...
>> @@ -3942,6 +3943,7 @@ static void 
>> virtio_net_device_realize(DeviceState *dev, Error **errp)
>>       n->mac_table.macs = g_malloc0(MAC_TABLE_ENTRIES * ETH_ALEN);
>>       n->vlans = g_malloc0(MAX_VLAN >> 3);
>> +    memset(n->vlans, 0xff, MAX_VLAN >> 3);
> 
> A nitpick: we don't need to init this memory with 0 before
> initing it with 0xff.
> 
> But looking at this, why can't we embed n->vlans directly into
> this structure, something like the attached patch?

VMState also needs a change: VMSTATE_BUFFER_POINTER_UNSAFE() should be 
replaced with plain VMSTATE_BUFFER(). Actually this is the only user of 
VMSTATE_BUFFER_POINTER_UNSAFE().

I would appreciate it if you submitted the patch with this VMState change
and a patch message.

> 
> This, and maybe a few other fields like it?

There is another candidate: n->mac_table.macs

But it is not straightforward to embed this array because it uses 
VMSTATE_VBUFFER_MULTIPLY().

Regards,
Akihiko Odaki

Re: [PATCH v3] virtio-net: Fix VLAN filter table reset timing
Posted by Lei Yang 4 months, 2 weeks ago
Tested this patch with virtio-net regression tests; everything works well.

Tested-by: Lei Yang <leiyang@redhat.com>

Re: [PATCH v3] virtio-net: Fix VLAN filter table reset timing
Posted by Michael Tokarev 4 months, 2 weeks ago
On 07.08.2025 04:20, Lei Yang wrote:
> Tested pass this patch with virtio-net regression tests, everything works well.
> 
> Tested-by: Lei Yang <leiyang@redhat.com>

I think it's too late by now, since this patch has been committed
to the master branch in the qemu git repository already.

Thanks,

/mjt
Re: [PATCH v3] virtio-net: Fix VLAN filter table reset timing
Posted by Konstantin Shkolnyy 4 months, 3 weeks ago
On 27-Jul-25 01:22, Akihiko Odaki wrote:
> ...

It works on s390.
Tested-by: Konstantin Shkolnyy <kshk@linux.ibm.com>

(I can't give a reliable Reviewed-by, as I lack a good understanding of
this code, but the change makes sense to me.)