[PATCH v1] net:bonding:support balance-alb interface with vlan to bridge

Sun Shouxin posted 1 patch 3 years, 8 months ago
There is a newer version of this series
drivers/net/bonding/bond_alb.c | 8 ++++++++
1 file changed, 8 insertions(+)
[PATCH v1] net:bonding:support balance-alb interface with vlan to bridge
Posted by Sun Shouxin 3 years, 8 months ago
In my test, balance-alb bonding with two slaves eth0 and eth1,
and then Bond0.150 is created with vlan id attached bond0.
After adding bond0.150 into one linux bridge, I noted that Bond0,
bond0.150 and  bridge were assigned to the same MAC as eth0.
Once bond0.150 receives a packet whose dest IP is bridge's
and dest MAC is eth1's, the linux bridge cannot process it as expected.
The patch fix the issue, and diagram as below:

eth1(mac:eth1_mac)--bond0(balance-alb,mac:eth0_mac)--eth0(mac:eth0_mac)
                      |
                   bond0.150(mac:eth0_mac)
                      |
                   bridge(ip:br_ip, mac:eth0_mac)--other port

Suggested-by: Hu Yadi <huyd12@chinatelecom.cn>
Signed-off-by: Sun Shouxin <sunshouxin@chinatelecom.cn>
---
 drivers/net/bonding/bond_alb.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 007d43e46dcb..0dea04f00f12 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -654,6 +654,7 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
 {
 	struct slave *tx_slave = NULL;
 	struct arp_pkt *arp;
+	struct net_device *dev;
 
 	if (!pskb_network_may_pull(skb, sizeof(*arp)))
 		return NULL;
@@ -665,6 +666,13 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
 	if (!bond_slave_has_mac_rx(bond, arp->mac_src))
 		return NULL;
 
+	dev = ip_dev_find(dev_net(bond->dev), arp->ip_src);
+	if (dev) {
+		if (netif_is_bridge_master(dev)) {
+			return NULL;
+		}
+	}
+
 	if (arp->op_code == htons(ARPOP_REPLY)) {
 		/* the arp must be sent on the selected rx channel */
 		tx_slave = rlb_choose_channel(skb, bond, arp);
-- 
2.27.0
Re: [PATCH v1] net:bonding:support balance-alb interface with vlan to bridge
Posted by Nikolay Aleksandrov 3 years, 8 months ago
On 08/08/2022 12:41, Sun Shouxin wrote:
> In my test, balance-alb bonding with two slaves eth0 and eth1,
> and then Bond0.150 is created with vlan id attached bond0.
> After adding bond0.150 into one linux bridge, I noted that Bond0,
> bond0.150 and  bridge were assigned to the same MAC as eth0.
> Once bond0.150 receives a packet whose dest IP is bridge's
> and dest MAC is eth1's, the linux bridge cannot process it as expected.
> The patch fix the issue, and diagram as below:
> 
> eth1(mac:eth1_mac)--bond0(balance-alb,mac:eth0_mac)--eth0(mac:eth0_mac)
>                       |
>                    bond0.150(mac:eth0_mac)
>                       |
>                    bridge(ip:br_ip, mac:eth0_mac)--other port
> 
> Suggested-by: Hu Yadi <huyd12@chinatelecom.cn>
> Signed-off-by: Sun Shouxin <sunshouxin@chinatelecom.cn>
> ---
>  drivers/net/bonding/bond_alb.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
> index 007d43e46dcb..0dea04f00f12 100644
> --- a/drivers/net/bonding/bond_alb.c
> +++ b/drivers/net/bonding/bond_alb.c
> @@ -654,6 +654,7 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
>  {
>  	struct slave *tx_slave = NULL;
>  	struct arp_pkt *arp;
> +	struct net_device *dev;

reverse xmas tree order

>  
>  	if (!pskb_network_may_pull(skb, sizeof(*arp)))
>  		return NULL;
> @@ -665,6 +666,13 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
>  	if (!bond_slave_has_mac_rx(bond, arp->mac_src))
>  		return NULL;
>  
> +	dev = ip_dev_find(dev_net(bond->dev), arp->ip_src);
> +	if (dev) {
> +		if (netif_is_bridge_master(dev)) {
> +			return NULL;
> +		}

nit: the {} aren't needed

> +	}
> +
>  	if (arp->op_code == htons(ARPOP_REPLY)) {
>  		/* the arp must be sent on the selected rx channel */
>  		tx_slave = rlb_choose_channel(skb, bond, arp);

Aside from the small cosmetic comments, have you tried adding the second mac address
as permanent in the bridge?
i.e.: 
$ bridge fdb add <eth1_mac> dev bond0.150 master permanent

That should fix your problem without any bonding hacks.

Cheers,
 Nik
Re: [PATCH v1] net:bonding:support balance-alb interface with vlan to bridge
Posted by Nikolay Aleksandrov 3 years, 8 months ago
On 08/08/2022 15:44, Nikolay Aleksandrov wrote:
> On 08/08/2022 12:41, Sun Shouxin wrote:
>> In my test, balance-alb bonding with two slaves eth0 and eth1,
>> and then Bond0.150 is created with vlan id attached bond0.
>> After adding bond0.150 into one linux bridge, I noted that Bond0,
>> bond0.150 and  bridge were assigned to the same MAC as eth0.
>> Once bond0.150 receives a packet whose dest IP is bridge's
>> and dest MAC is eth1's, the linux bridge cannot process it as expected.
>> The patch fix the issue, and diagram as below:
>>
>> eth1(mac:eth1_mac)--bond0(balance-alb,mac:eth0_mac)--eth0(mac:eth0_mac)
>>                       |
>>                    bond0.150(mac:eth0_mac)
>>                       |
>>                    bridge(ip:br_ip, mac:eth0_mac)--other port
>>
>> Suggested-by: Hu Yadi <huyd12@chinatelecom.cn>
>> Signed-off-by: Sun Shouxin <sunshouxin@chinatelecom.cn>
>> ---
>>  drivers/net/bonding/bond_alb.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
>> index 007d43e46dcb..0dea04f00f12 100644
>> --- a/drivers/net/bonding/bond_alb.c
>> +++ b/drivers/net/bonding/bond_alb.c
>> @@ -654,6 +654,7 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
>>  {
>>  	struct slave *tx_slave = NULL;
>>  	struct arp_pkt *arp;
>> +	struct net_device *dev;
> 
> reverse xmas tree order
> 
>>  
>>  	if (!pskb_network_may_pull(skb, sizeof(*arp)))
>>  		return NULL;
>> @@ -665,6 +666,13 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
>>  	if (!bond_slave_has_mac_rx(bond, arp->mac_src))
>>  		return NULL;
>>  
>> +	dev = ip_dev_find(dev_net(bond->dev), arp->ip_src);
>> +	if (dev) {
>> +		if (netif_is_bridge_master(dev)) {
>> +			return NULL;
>> +		}
> 
> nit: the {} aren't needed
> 
>> +	}
>> +
>>  	if (arp->op_code == htons(ARPOP_REPLY)) {
>>  		/* the arp must be sent on the selected rx channel */
>>  		tx_slave = rlb_choose_channel(skb, bond, arp);
> 
> Aside from the small cosmetic comments, have you tried adding the second mac address
> as permanent in the bridge?
> i.e.: 
> $ bridge fdb add <eth1_mac> dev bond0.150 master permanent
> 
> That should fix your problem without any bonding hacks.
> 
> Cheers,
>  Nik

Ah, I just found your original submission and understood the problem better.
The fix sounds good, as Jay explained there, but the commit message can use a bit
more explanation of what exactly is wrong and why the bond shouldn't load balance
these. :)

Anyway, nevermind my last comment.