[PATCH RFC net-next 3/3] net: dsa: deny 8021q uppers on vlan unaware bridged ports

Jonas Gorski posted 3 patches 3 months ago
[PATCH RFC net-next 3/3] net: dsa: deny 8021q uppers on vlan unaware bridged ports
Posted by Jonas Gorski 3 months ago
Documentation/networking/switchdev.rst says:

- with VLAN filtering turned off, the bridge will process all ingress
  traffic for the port, except for the traffic tagged with a VLAN ID
  destined for a VLAN upper.

But there is currently no way to configure this in dsa. The vlan upper
will trigger a vlan add to the driver, but it is the same message as a
newly configured bridge VLAN.

Therefore traffic tagged with the VID will continue to be forwarded to
other ports, and therefore we cannot support VLAN uppers on ports of a
VLAN unaware bridges.

Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
---
 net/dsa/port.c | 23 ++++-------------------
 net/dsa/user.c | 38 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 41 insertions(+), 20 deletions(-)

diff --git a/net/dsa/port.c b/net/dsa/port.c
index 082573ae6864..d7746885f7e0 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -728,35 +728,20 @@ static bool dsa_port_can_apply_vlan_filtering(struct dsa_port *dp,
 {
 	struct dsa_switch *ds = dp->ds;
 	struct dsa_port *other_dp;
-	int err;
 
-	/* VLAN awareness was off, so the question is "can we turn it on".
+	/* VLAN awareness was on, so the question is "can we turn it off".
 	 * We may have had 8021q uppers, those need to go. Make sure we don't
 	 * enter an inconsistent state: deny changing the VLAN awareness state
 	 * as long as we have 8021q uppers.
 	 */
-	if (vlan_filtering && dsa_port_is_user(dp)) {
-		struct net_device *br = dsa_port_bridge_dev_get(dp);
+	if (!vlan_filtering && dsa_port_is_user(dp)) {
 		struct net_device *upper_dev, *user = dp->user;
 		struct list_head *iter;
 
 		netdev_for_each_upper_dev_rcu(user, upper_dev, iter) {
-			struct bridge_vlan_info br_info;
-			u16 vid;
-
-			if (!is_vlan_dev(upper_dev))
-				continue;
-
-			vid = vlan_dev_vlan_id(upper_dev);
-
-			/* br_vlan_get_info() returns -EINVAL or -ENOENT if the
-			 * device, respectively the VID is not found, returning
-			 * 0 means success, which is a failure for us here.
-			 */
-			err = br_vlan_get_info(br, vid, &br_info);
-			if (err == 0) {
+			if (is_vlan_dev(upper_dev)) {
 				NL_SET_ERR_MSG_MOD(extack,
-						   "Must first remove VLAN uppers having VIDs also present in bridge");
+						   "Must first remove VLAN uppers from bridged ports");
 				return false;
 			}
 		}
diff --git a/net/dsa/user.c b/net/dsa/user.c
index e8c6452780b0..35265829aa90 100644
--- a/net/dsa/user.c
+++ b/net/dsa/user.c
@@ -3156,6 +3156,30 @@ dsa_prevent_bridging_8021q_upper(struct net_device *dev,
 	return NOTIFY_DONE;
 }
 
+/* Must be called under rcu_read_lock() */
+static int
+dsa_user_vlan_check_for_any_8021q_uppers(struct dsa_port *dp)
+{
+	struct dsa_switch *ds = dp->ds;
+	struct dsa_port *other_dp;
+
+	dsa_switch_for_each_user_port(other_dp, ds) {
+		struct net_device *user = other_dp->user;
+		struct net_device *upper_dev;
+		struct list_head *iter;
+
+		if (!dsa_port_bridge_same(dp, other_dp))
+			continue;
+
+		netdev_for_each_upper_dev_rcu(user, upper_dev, iter) {
+			if (is_vlan_dev(upper_dev))
+				return -EBUSY;
+		}
+	}
+
+	return 0;
+}
+
 static int
 dsa_user_check_8021q_upper(struct net_device *dev,
 			   struct netdev_notifier_changeupper_info *info)
@@ -3167,10 +3191,22 @@ dsa_user_check_8021q_upper(struct net_device *dev,
 	int err = NOTIFY_DONE;
 	u16 vid;
 
-	if (!br || !br_vlan_enabled(br))
+	if (!br)
 		return NOTIFY_DONE;
 
 	extack = netdev_notifier_info_to_extack(&info->info);
+
+	if (!br_vlan_enabled(br)) {
+		rcu_read_lock();
+		err = dsa_user_vlan_check_for_any_8021q_uppers(dp);
+		rcu_read_unlock();
+		if (err) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "VLAN uppers not supported with non filtering bridges");
+			return notifier_from_errno(err);
+		}
+	}
+
 	vid = vlan_dev_vlan_id(info->upper_dev);
 
 	/* br_vlan_get_info() returns -EINVAL or -ENOENT if the
-- 
2.43.0
Re: [PATCH RFC net-next 3/3] net: dsa: deny 8021q uppers on vlan unaware bridged ports
Posted by Vladimir Oltean 3 months ago
On Mon, Nov 10, 2025 at 10:44:43PM +0100, Jonas Gorski wrote:
> Documentation/networking/switchdev.rst says:
> 
> - with VLAN filtering turned off, the bridge will process all ingress
>   traffic for the port, except for the traffic tagged with a VLAN ID
>   destined for a VLAN upper.
> 
> But there is currently no way to configure this in dsa. The vlan upper
> will trigger a vlan add to the driver, but it is the same message as a
> newly configured bridge VLAN.

hmm, not necessarily. vlan_vid_add() will only go through with
vlan_add_rx_filter_info() -> dev->netdev_ops->ndo_vlan_rx_add_vid()
if the device is vlan_hw_filter_capable().

And that's the key, DSA user ports only(*) become vlan_hw_filter_capable()
when under a VLAN _aware_ bridge. (*)actually here is the exception
you're probably hitting: due to the ds->vlan_filtering_is_global quirk,
unrelated ports become vlan_hw_filter_capable() too, not just the ones
under the VLAN-aware bridge. This is possibly what you're seeing and the
reason for the incorrect conclusion that VLAN-unaware bridge ports have
the behaviour you mention.

> Therefore traffic tagged with the VID will continue to be forwarded to
> other ports, and therefore we cannot support VLAN uppers on ports of a
> VLAN unaware bridges.

Incorrect premise => incorrect conclusion.
(not to say that an uncaught problem isn't there for ds->vlan_filtering_is_global
switches, but this isn't it)
Re: [PATCH RFC net-next 3/3] net: dsa: deny 8021q uppers on vlan unaware bridged ports
Posted by Jonas Gorski 2 months, 4 weeks ago
On Mon, Nov 10, 2025 at 11:25 PM Vladimir Oltean
<vladimir.oltean@nxp.com> wrote:
>
> On Mon, Nov 10, 2025 at 10:44:43PM +0100, Jonas Gorski wrote:
> > Documentation/networking/switchdev.rst says:
> >
> > - with VLAN filtering turned off, the bridge will process all ingress
> >   traffic for the port, except for the traffic tagged with a VLAN ID
> >   destined for a VLAN upper.
> >
> > But there is currently no way to configure this in dsa. The vlan upper
> > will trigger a vlan add to the driver, but it is the same message as a
> > newly configured bridge VLAN.
>
> hmm, not necessarily. vlan_vid_add() will only go through with
> vlan_add_rx_filter_info() -> dev->netdev_ops->ndo_vlan_rx_add_vid()
> if the device is vlan_hw_filter_capable().
>
> And that's the key, DSA user ports only(*) become vlan_hw_filter_capable()
> when under a VLAN _aware_ bridge. (*)actually here is the exception
> you're probably hitting: due to the ds->vlan_filtering_is_global quirk,
> unrelated ports become vlan_hw_filter_capable() too, not just the ones
> under the VLAN-aware bridge. This is possibly what you're seeing and the
> reason for the incorrect conclusion that VLAN-unaware bridge ports have
> the behaviour you mention.

Ah, right, no call at all.

But this is about a bridge with VLAN filtering disabled, so filtering
isn't enabled on any port, so ds->vlan_filtering_is_global does not
matter.

See my examples in my reply to 0/3, which hopefully clarify what I am
trying to prevent here.

> > Therefore traffic tagged with the VID will continue to be forwarded to
> > other ports, and therefore we cannot support VLAN uppers on ports of a
> > VLAN unaware bridges.
>
> Incorrect premise => incorrect conclusion.
> (not to say that an uncaught problem isn't there for ds->vlan_filtering_is_global
> switches, but this isn't it)

This should happen regardless of vlan filtering is global.

But I noticed while testing that apparently b53 in filtering=0 mode
does not forward any tagged traffic (and I think I know why ...).

Is there a way to ask for a replay of the fdb (static) entries? To fix
this for older switches, we need to disable 802.1q mode, but this also
switches the ARL from IVL to SVL, which changes the hashing, and would
break any existing entries. So we need to flush the ARL before
toggling 802.1q mode, and then reprogram any static entries.

Best regards,
Jonas
Re: [PATCH RFC net-next 3/3] net: dsa: deny 8021q uppers on vlan unaware bridged ports
Posted by Vladimir Oltean 2 months, 4 weeks ago
On Tue, Nov 11, 2025 at 11:06:48AM +0100, Jonas Gorski wrote:
> But I noticed while testing that apparently b53 in filtering=0 mode
> does not forward any tagged traffic (and I think I know why ...).
> 
> Is there a way to ask for a replay of the fdb (static) entries? To fix
> this for older switches, we need to disable 802.1q mode, but this also
> switches the ARL from IVL to SVL, which changes the hashing, and would
> break any existing entries. So we need to flush the ARL before
> toggling 802.1q mode, and then reprogram any static entries.

I'm not clear on what happens. "Broken" FDB entries in the incorrect
bridge vlan_filtering mode sounds like normal behaviour (FDB entries
with VID=0 while vlan_filtering=1, or FDB entries with VID!=0 while
vlan_filtering=0). They should just sit idle in the ARL until the VLAN
filtering mode makes them active.
Re: [PATCH RFC net-next 3/3] net: dsa: deny 8021q uppers on vlan unaware bridged ports
Posted by Jonas Gorski 2 months, 4 weeks ago
On Tue, Nov 11, 2025 at 12:56 PM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Tue, Nov 11, 2025 at 11:06:48AM +0100, Jonas Gorski wrote:
> > But I noticed while testing that apparently b53 in filtering=0 mode
> > does not forward any tagged traffic (and I think I know why ...).
> >
> > Is there a way to ask for a replay of the fdb (static) entries? To fix
> > this for older switches, we need to disable 802.1q mode, but this also
> > switches the ARL from IVL to SVL, which changes the hashing, and would
> > break any existing entries. So we need to flush the ARL before
> > toggling 802.1q mode, and then reprogram any static entries.
>
> I'm not clear on what happens. "Broken" FDB entries in the incorrect
> bridge vlan_filtering mode sounds like normal behaviour (FDB entries
> with VID=0 while vlan_filtering=1, or FDB entries with VID!=0 while
> vlan_filtering=0). They should just sit idle in the ARL until the VLAN
> filtering mode makes them active.

When in SVL mode (vlan disabled), the ARL switches from mac+vid to
just mac for hashing ARL entries. And I don't know if mac+vid=0 yields
the same hash as only mac. It would it the switch uses vid=0 when not
vlan aware, but if it skips the vid then it wouldn't.

And we automatically install static entries for the MAC addresses of
ports (and maybe other non-dsa bridged devices), so we may need to
have these twice in the ARL table (once for non-filtering, once for
filtering).

If the hash is not the same, this can happen:

vlan_enabled=1, ARL hashing uses mac+vid
add static entry mac=abc,vid=0 for port 1 => hash(mac, vid) -> entry 123
vlan_enabled => 0, ARL hashing uses only mac
packet received on port 2 for mac=abc => hash(mac) => entry 456 => no
entry found => flood (which may not include port 1).

when trying to delete the static entry => lookup for mac=abc,vid=0 =>
hash(mac) => entry 456 => no such entry.

Then maybe we ignore the error, but the moment we enable vlan again,
the hashing changes back to mac+vid, so the "deleted" static entry
becomes active again (despite the linux fdb not knowing about it
anymore).

And even if the hash is the same, it would mean we cannot interact
with any preexisting entries for vid!=1 that were added with vlan
filtering = 1. So we cannot delete them when e.g. removing a port from
the bridge, or deleting the bridge.

So the safest would be to remove all static entries before changing
vlan filtering, and then re-adding them afterwards.

Best regards,
Jonas
Re: [PATCH RFC net-next 3/3] net: dsa: deny 8021q uppers on vlan unaware bridged ports
Posted by Vladimir Oltean 2 months, 4 weeks ago
On Tue, Nov 11, 2025 at 03:09:08PM +0100, Jonas Gorski wrote:
> On Tue, Nov 11, 2025 at 12:56 PM Vladimir Oltean <olteanv@gmail.com> wrote:
> >
> > On Tue, Nov 11, 2025 at 11:06:48AM +0100, Jonas Gorski wrote:
> > > But I noticed while testing that apparently b53 in filtering=0 mode
> > > does not forward any tagged traffic (and I think I know why ...).
> > >
> > > Is there a way to ask for a replay of the fdb (static) entries? To fix
> > > this for older switches, we need to disable 802.1q mode, but this also
> > > switches the ARL from IVL to SVL, which changes the hashing, and would
> > > break any existing entries. So we need to flush the ARL before
> > > toggling 802.1q mode, and then reprogram any static entries.
> >
> > I'm not clear on what happens. "Broken" FDB entries in the incorrect
> > bridge vlan_filtering mode sounds like normal behaviour (FDB entries
> > with VID=0 while vlan_filtering=1, or FDB entries with VID!=0 while
> > vlan_filtering=0). They should just sit idle in the ARL until the VLAN
> > filtering mode makes them active.
> 
> When in SVL mode (vlan disabled), the ARL switches from mac+vid to
> just mac for hashing ARL entries. And I don't know if mac+vid=0 yields
> the same hash as only mac. It would it the switch uses vid=0 when not
> vlan aware, but if it skips the vid then it wouldn't.
> 
> And we automatically install static entries for the MAC addresses of
> ports (and maybe other non-dsa bridged devices), so we may need to
> have these twice in the ARL table (once for non-filtering, once for
> filtering).
> 
> If the hash is not the same, this can happen:
> 
> vlan_enabled=1, ARL hashing uses mac+vid
> add static entry mac=abc,vid=0 for port 1 => hash(mac, vid) -> entry 123
> vlan_enabled => 0, ARL hashing uses only mac
> packet received on port 2 for mac=abc => hash(mac) => entry 456 => no
> entry found => flood (which may not include port 1).
> 
> when trying to delete the static entry => lookup for mac=abc,vid=0 =>
> hash(mac) => entry 456 => no such entry.
> 
> Then maybe we ignore the error, but the moment we enable vlan again,
> the hashing changes back to mac+vid, so the "deleted" static entry
> becomes active again (despite the linux fdb not knowing about it
> anymore).
> 
> And even if the hash is the same, it would mean we cannot interact
> with any preexisting entries for vid!=1 that were added with vlan
> filtering = 1. So we cannot delete them when e.g. removing a port from
> the bridge, or deleting the bridge.
> 
> So the safest would be to remove all static entries before changing
> vlan filtering, and then re-adding them afterwards.
> 
> Best regards,
> Jonas

If you just want to debug whether this is the case or not, then as I
understand, for the moment you only care about static FDB entries on the
CPU port, not on user ports added with 'bridge fdb add ... master static".
If so, these FDB entries are available in the cpu_dp->fdbs list. For
user ports we don't bother keeping track.

Regarding switchdev FDB replay, it's possible but has very high
complexity. The base call would be to switchdev_bridge_port_replay(),
then you'd need to set up two parallel notifier blocks through which
you're informed of the existing objects (not the usual dsa_user_switchdev_notifier
and dsa_user_switchdev_blocking_notifier), whose internal processing is
partly similar (the event filtering and replication) and partly different:
instead of calling dsa_schedule_work() to program the FDB entries to
hardware, you just add them to a list that is kept in a context
structure, which is passed to the caller once the replay is over and the
list is complete.

For the moment, dp->fdbs should be sufficient to prove/disprove a point.