net/bridge/br_vlan.c | 4 +++- net/dsa/switch.c | 22 ++++++++++++---------- 2 files changed, 15 insertions(+), 11 deletions(-)
While trying to figure out the hardware behavior of a DSA supported switch chip and printing various internal vlan state changes, I noticed that some flows never triggered adding the cpu port to vlans, preventing it from receiving any of the VLANs traffic. E.g. the following sequence would cause the cpu port not being member of the vlan, despite the bridge vlan output looking correct: $ ip link add swbridge type bridge vlan_filtering 1 vlan_default_pvid 1 $ ip link set lan1 master swbridge $ bridge vlan add dev lan1 vid 1 pvid untagged $ bridge vlan add dev swbridge vid 1 pvid untagged self Adding more printk debugging, I traced it br_vlan_add_existing() setting changed to true (since the vlan "gained" the pvid untagged flags), and then the dsa code ignoring the vlan notification, since it is a vlan for the cpu port that is updated. Then I noticed that deleting that vlan didn't work either: $ bridge vlan port vlan-id lan1 1 PVID Egress Untagged swbridge 1 PVID Egress Untagged $ bridge vlan del dev swbridge vid 1 self $ bridge vlan port vlan-id lan1 1 PVID Egress Untagged swbridge 1 Egress Untagged which is caused by the same issue, because from the dsa standpoint I am now trying to delete a non-existing vlan. After fixing that, both were now correctly working, but the configured vlan on the cpu port would be stuck with whatever the initial add set. E.g.: $ bridge vlan add dev swbridge vid 1 pvid untagged self $ bridge vlan add dev swbridge vid 1 self would change the flags in the vlandb, but the hardware configured vlan would still untag at egress to the cpu port. Patch two fixes this by allowing changed = true vlan add notifications for the cpu port, but skip the refcounting. Presumably with the patch there should never be a vlan add notification for the brentry with changed set to true anymore. In case my reasoning is wrong, I added a WARN_ON(), but I didn't get to trigger it so far. I did a check of all handlers of switchdev port vlan add notifications, but DSA seems to be the only one that even looks at the changed flag. Sent as RFC as I am not too familiar with the dsa/vlan code, so I might very well have overlooked something. Jonas Gorski (2): net: bridge: switchdev: do not notify new brentries as changed net: dsa: propagate brentry flag changes net/bridge/br_vlan.c | 4 +++- net/dsa/switch.c | 22 ++++++++++++---------- 2 files changed, 15 insertions(+), 11 deletions(-) -- 2.43.0
On Sat, Apr 12, 2025 at 02:24:26PM +0200, Jonas Gorski wrote: > While trying to figure out the hardware behavior of a DSA supported > switch chip and printing various internal vlan state changes, I noticed > that some flows never triggered adding the cpu port to vlans, preventing > it from receiving any of the VLANs traffic. > > E.g. the following sequence would cause the cpu port not being member of > the vlan, despite the bridge vlan output looking correct: > > $ ip link add swbridge type bridge vlan_filtering 1 vlan_default_pvid 1 > $ ip link set lan1 master swbridge At this step, dsa_port_bridge_join() -> switchdev_bridge_port_offload() -> ... -> br_switchdev_port_offload() -> nbp_switchdev_sync_objs() -> br_switchdev_vlan_replay() -> br_switchdev_vlan_replay_group(br_vlan_group(br)) -> br_switchdev_vlan_replay_one() should have notified DSA, with changed=false. It should be processed by dsa_user_host_vlan_add() -> dsa_port_host_vlan_add(). You make it sound like that doesn't happen. I notice you didn't mention which "DSA supported chip" you are using. By any chance, does its driver set ds->configure_vlan_while_not_filtering = false? That would be my prime suspect, making dsa_port_skip_vlan_configuration() ignore the code path above, because the bridge port is not yet VLAN filtering. It becomes VLAN filtering only a bit later in dsa_port_bridge_join(), with the dsa_port_switchdev_sync_attrs() -> dsa_port_vlan_filtering(br_vlan_enabled(br)) call. If that is the case, the only thing that is slightly confusing to me is why you haven't seen the "skipping configuration of VLAN" extack message. But anyway, if the theory above is true, you should instead be looking at adding proper VLAN support to said driver, and drop this set instead, because VLAN replay isn't working properly. > $ bridge vlan add dev lan1 vid 1 pvid untagged > $ bridge vlan add dev swbridge vid 1 pvid untagged self > > Adding more printk debugging, I traced it br_vlan_add_existing() setting > changed to true (since the vlan "gained" the pvid untagged flags), and > then the dsa code ignoring the vlan notification, since it is a vlan for > the cpu port that is updated. Yes, this part and everything that follows should be correct.
On Sat, Apr 12, 2025 at 3:34 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > On Sat, Apr 12, 2025 at 02:24:26PM +0200, Jonas Gorski wrote: > > While trying to figure out the hardware behavior of a DSA supported > > switch chip and printing various internal vlan state changes, I noticed > > that some flows never triggered adding the cpu port to vlans, preventing > > it from receiving any of the VLANs traffic. > > > > E.g. the following sequence would cause the cpu port not being member of > > the vlan, despite the bridge vlan output looking correct: > > > > $ ip link add swbridge type bridge vlan_filtering 1 vlan_default_pvid 1 > > $ ip link set lan1 master swbridge > > At this step, dsa_port_bridge_join() -> switchdev_bridge_port_offload() > -> ... -> br_switchdev_port_offload() -> nbp_switchdev_sync_objs() -> > br_switchdev_vlan_replay() -> br_switchdev_vlan_replay_group(br_vlan_group(br)) > -> br_switchdev_vlan_replay_one() should have notified DSA, with changed=false. > It should be processed by dsa_user_host_vlan_add() -> dsa_port_host_vlan_add(). > > You make it sound like that doesn't happen. Yes, because I messed up writing down what I did. That should have been vlan_default_pvid 0 so there are no VLANs configured by default. > > I notice you didn't mention which "DSA supported chip" you are using. > By any chance, does its driver set ds->configure_vlan_while_not_filtering = false? > That would be my prime suspect, making dsa_port_skip_vlan_configuration() ignore > the code path above, because the bridge port is not yet VLAN filtering. > It becomes VLAN filtering only a bit later in dsa_port_bridge_join(), > with the dsa_port_switchdev_sync_attrs() -> dsa_port_vlan_filtering(br_vlan_enabled(br)) > call. > > If that is the case, the only thing that is slightly confusing to me is > why you haven't seen the "skipping configuration of VLAN" extack message. > But anyway, if the theory above is true, you should instead be looking > at adding proper VLAN support to said driver, and drop this set instead, > because VLAN replay isn't working properly. It's b53, and on first glance it does not set it. But as I just noticed, I messed up the example. So adding the lan1 vid is supposed to be the very first vlan that is configured. > > > $ bridge vlan add dev lan1 vid 1 pvid untagged > > $ bridge vlan add dev swbridge vid 1 pvid untagged self > > > > Adding more printk debugging, I traced it br_vlan_add_existing() setting > > changed to true (since the vlan "gained" the pvid untagged flags), and > > then the dsa code ignoring the vlan notification, since it is a vlan for > > the cpu port that is updated. > > Yes, this part and everything that follows should be correct. Sorry for the mistake. I did it correctly while testing though! Best regards, Jonas
On Sat, Apr 12, 2025 at 3:50 PM Jonas Gorski <jonas.gorski@gmail.com> wrote: > > On Sat, Apr 12, 2025 at 3:34 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > > > On Sat, Apr 12, 2025 at 02:24:26PM +0200, Jonas Gorski wrote: > > > While trying to figure out the hardware behavior of a DSA supported > > > switch chip and printing various internal vlan state changes, I noticed > > > that some flows never triggered adding the cpu port to vlans, preventing > > > it from receiving any of the VLANs traffic. > > > > > > E.g. the following sequence would cause the cpu port not being member of > > > the vlan, despite the bridge vlan output looking correct: > > > > > > $ ip link add swbridge type bridge vlan_filtering 1 vlan_default_pvid 1 > > > $ ip link set lan1 master swbridge > > > > At this step, dsa_port_bridge_join() -> switchdev_bridge_port_offload() > > -> ... -> br_switchdev_port_offload() -> nbp_switchdev_sync_objs() -> > > br_switchdev_vlan_replay() -> br_switchdev_vlan_replay_group(br_vlan_group(br)) > > -> br_switchdev_vlan_replay_one() should have notified DSA, with changed=false. > > It should be processed by dsa_user_host_vlan_add() -> dsa_port_host_vlan_add(). > > > > You make it sound like that doesn't happen. > > Yes, because I messed up writing down what I did. That should have > been vlan_default_pvid 0 so there are no VLANs configured by default. > > > > I notice you didn't mention which "DSA supported chip" you are using. > > By any chance, does its driver set ds->configure_vlan_while_not_filtering = false? > > That would be my prime suspect, making dsa_port_skip_vlan_configuration() ignore > > the code path above, because the bridge port is not yet VLAN filtering. > > It becomes VLAN filtering only a bit later in dsa_port_bridge_join(), > > with the dsa_port_switchdev_sync_attrs() -> dsa_port_vlan_filtering(br_vlan_enabled(br)) > > call. > > > > If that is the case, the only thing that is slightly confusing to me is > > why you haven't seen the "skipping configuration of VLAN" extack message. > > But anyway, if the theory above is true, you should instead be looking > > at adding proper VLAN support to said driver, and drop this set instead, > > because VLAN replay isn't working properly. > > It's b53, and on first glance it does not set it. But as I just > noticed, I messed up the example. So I had the chance to properly check and b53 does not unset it, so it should be true. > So adding the lan1 vid is supposed to be the very first vlan that is configured. > > > > > > $ bridge vlan add dev lan1 vid 1 pvid untagged > > > $ bridge vlan add dev swbridge vid 1 pvid untagged self > > > > > > Adding more printk debugging, I traced it br_vlan_add_existing() setting > > > changed to true (since the vlan "gained" the pvid untagged flags), and > > > then the dsa code ignoring the vlan notification, since it is a vlan for > > > the cpu port that is updated. > > > > Yes, this part and everything that follows should be correct. So as an addendum that I probably should have included in the cover letter and commit message: Starting with no vlans configured on the bridge: $ bridge vlan add dev lan1 vid 1 pvid untagged $ bridge vlan add dev swbridge vid 1 pvid untagged self does not work, but $ bridge vlan add dev lan1 vid 1 pvid untagged $ bridge vlan add dev swbridge vid 1 self does properly trigger a configuration of the vlan on the cpu port. And the difference is that in the former, would_change => vlan->changed is true (due to "pvid untagged"), and in the latter it is not. Best regards, Jonas
© 2016 - 2025 Red Hat, Inc.