[v1] net/ipv6: allow device-only routes via the multipath API

[PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by azey 2 months, 3 weeks ago

At some point after b5d2d75e079a ("net/ipv6: Do not allow device only
routes via the multipath API"), the IPv6 stack was updated such that
device-only multipath routes can be installed and work correctly, but
they are still rejected by the code.

This change removes the has_gateway check from rtm_to_fib6_multipath_config()
and the fib_nh_gw_family check from rt6_qualify_for_ecmp(), allowing
device-only multipath routes to be installed again.

Signed-off-by: azey <me@azey.net>
---

I tested this on a VM with two wireguard interfaces, and it seems to
work as expected. It also causes fe80::/64 and ff00::/8 to be installed as
multipath routes if there are multiple interfaces, but from my (somewhat
limited) testing that doesn't cause any issues.

I'm also not completely sure whether there are any other places in the
code that assume multipath nexthops must have a gateway addr, but I
didn't immediately find any.

PS: This is my very first contribution to the kernel (and indeed my first
time sending a patch via mail), so sorry in advance if I messed anything up.
---
 include/net/ip6_route.h | 3 +--
 net/ipv6/route.c        | 6 ------
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index 7c5512baa4b2..07e131f9fcf5 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -73,8 +73,7 @@ static inline bool rt6_need_strict(const struct in6_addr *daddr)
 static inline bool rt6_qualify_for_ecmp(const struct fib6_info *f6i)
 {
 	/* the RTF_ADDRCONF flag filters out RA's */
-	return !(f6i->fib6_flags & RTF_ADDRCONF) && !f6i->nh &&
-		f6i->fib6_nh->fib_nh_gw_family;
+	return !(f6i->fib6_flags & RTF_ADDRCONF) && !f6i->nh;
 }
 
 void ip6_route_input(struct sk_buff *skb);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index aee6a10b112a..40763b90e22c 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -5138,12 +5138,6 @@ static int rtm_to_fib6_multipath_config(struct fib6_config *cfg,
 			}
 		}
 
-		if (newroute && (cfg->fc_nh_id || !has_gateway)) {
-			NL_SET_ERR_MSG(extack,
-				       "Device only routes can not be added for IPv6 using the multipath API.");
-			return -EINVAL;
-		}
-
 		rtnh = rtnh_next(rtnh, &remaining);
 	} while (rtnh_ok(rtnh, remaining));
 
-- 
2.51.0
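
The effect of the rt6_qualify_for_ecmp() hunk above can be looked at in
isolation. The following is a minimal userspace sketch, not kernel code:
struct model_route, MODEL_RTF_ADDRCONF, qualify_before() and
qualify_after() are hypothetical names introduced here, and the struct
only models the three fields the predicate reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's RTF_ADDRCONF route flag. */
#define MODEL_RTF_ADDRCONF 0x00040000u

/* Simplified model of the fields rt6_qualify_for_ecmp() inspects. */
struct model_route {
	uint32_t flags;      /* models f6i->fib6_flags */
	bool has_nh_object;  /* models f6i->nh != NULL */
	uint8_t gw_family;   /* models fib_nh_gw_family; 0 means device-only */
};

/* Predicate as it reads before the patch: a gateway is required. */
static inline bool qualify_before(const struct model_route *r)
{
	return !(r->flags & MODEL_RTF_ADDRCONF) && !r->has_nh_object &&
	       r->gw_family;
}

/* Predicate as it reads after the patch: the gateway test is gone. */
static inline bool qualify_after(const struct model_route *r)
{
	return !(r->flags & MODEL_RTF_ADDRCONF) && !r->has_nh_object;
}
```

In this model, a route with gw_family == 0 and no RTF_ADDRCONF flag is
rejected by qualify_before() and accepted by qualify_after(); routes with
a gateway, and routes learned from RAs, are treated the same by both.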

Re: [PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by kernel test robot 2 months, 3 weeks ago

Hi azey,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]
[also build test WARNING on net/main klassert-ipsec/master linus/master v6.18-rc6 next-20251117]
[cannot apply to horms-ipvs/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/azey/net-ipv6-allow-device-only-routes-via-the-multipath-API/20251117-023331
base:   net-next/main
patch link:    https://lore.kernel.org/r/a6vmtv3ylu224fnj5awi6xrgnjoib5r2jm3kny672hemsk5ifi%40ychcxqnmy5us
patch subject: [PATCH] net/ipv6: allow device-only routes via the multipath API
config: i386-randconfig-141-20251117 (https://download.01.org/0day-ci/archive/20251118/202511180742.7iC868V8-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251118/202511180742.7iC868V8-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511180742.7iC868V8-lkp@intel.com/

All warnings (new ones prefixed by >>):

   net/ipv6/route.c: In function 'rtm_to_fib6_multipath_config':
>> net/ipv6/route.c:5122:22: warning: variable 'has_gateway' set but not used [-Wunused-but-set-variable]
    5122 |                 bool has_gateway = cfg->fc_flags & RTF_GATEWAY;
         |                      ^~~~~~~~~~~


vim +/has_gateway +5122 net/ipv6/route.c

86872cb57925c4 Thomas Graf       2006-08-22  5105  
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5106  static int rtm_to_fib6_multipath_config(struct fib6_config *cfg,
bd11ff421d36ab Kuniyuki Iwashima 2025-04-17  5107  					struct netlink_ext_ack *extack,
bd11ff421d36ab Kuniyuki Iwashima 2025-04-17  5108  					bool newroute)
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5109  {
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5110  	struct rtnexthop *rtnh;
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5111  	int remaining;
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5112  
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5113  	remaining = cfg->fc_mp_len;
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5114  	rtnh = (struct rtnexthop *)cfg->fc_mp;
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5115  
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5116  	if (!rtnh_ok(rtnh, remaining)) {
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5117  		NL_SET_ERR_MSG(extack, "Invalid nexthop configuration - no valid nexthops");
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5118  		return -EINVAL;
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5119  	}
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5120  
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5121  	do {
e6f497955fb6a0 Kuniyuki Iwashima 2025-04-17 @5122  		bool has_gateway = cfg->fc_flags & RTF_GATEWAY;
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5123  		int attrlen = rtnh_attrlen(rtnh);
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5124  
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5125  		if (attrlen > 0) {
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5126  			struct nlattr *nla, *attrs;
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5127  
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5128  			attrs = rtnh_attrs(rtnh);
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5129  			nla = nla_find(attrs, attrlen, RTA_GATEWAY);
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5130  			if (nla) {
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5131  				if (nla_len(nla) < sizeof(cfg->fc_gateway)) {
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5132  					NL_SET_ERR_MSG(extack,
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5133  						       "Invalid IPv6 address in RTA_GATEWAY");
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5134  					return -EINVAL;
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5135  				}
e6f497955fb6a0 Kuniyuki Iwashima 2025-04-17  5136  
e6f497955fb6a0 Kuniyuki Iwashima 2025-04-17  5137  				has_gateway = true;
e6f497955fb6a0 Kuniyuki Iwashima 2025-04-17  5138  			}
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5139  		}
e6f497955fb6a0 Kuniyuki Iwashima 2025-04-17  5140  
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5141  		rtnh = rtnh_next(rtnh, &remaining);
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5142  	} while (rtnh_ok(rtnh, remaining));
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5143  
f0a56c17e64bb5 Kuniyuki Iwashima 2025-05-15  5144  	return lwtunnel_valid_encap_type_attr(cfg->fc_mp, cfg->fc_mp_len, extack);
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5145  }
4cb4861d8c3b3b Kuniyuki Iwashima 2025-04-17  5146  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
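
The warning arises because the patch deletes the only reader of
has_gateway while leaving its assignments in place. One possible way to
resolve it, assuming the check stays removed, would be to drop the
variable and keep only the RTA_GATEWAY length validation. The sketch
below models that loop in userspace; struct model_rtnexthop,
MODEL_IN6_ADDR_LEN and model_validate_nexthops() are hypothetical names,
and the lwtunnel encap validation at the end of the real function is
left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Size of an IPv6 address in bytes; models sizeof(cfg->fc_gateway). */
#define MODEL_IN6_ADDR_LEN 16

/* Simplified model of one nexthop entry in the multipath attribute. */
struct model_rtnexthop {
	bool has_gateway_attr;    /* models nla_find(..., RTA_GATEWAY) != NULL */
	size_t gateway_attr_len;  /* models nla_len(nla) */
};

/*
 * Walk all nexthops and reject a too-short gateway attribute.
 * Nothing records whether a gateway was present, so there is no
 * set-but-unused variable left behind.
 */
static inline int model_validate_nexthops(const struct model_rtnexthop *nhs,
					  size_t count)
{
	size_t i;

	if (count == 0)
		return -EINVAL; /* no valid nexthops */

	for (i = 0; i < count; i++) {
		if (nhs[i].has_gateway_attr &&
		    nhs[i].gateway_attr_len < MODEL_IN6_ADDR_LEN)
			return -EINVAL; /* invalid IPv6 address in RTA_GATEWAY */
	}

	return 0;
}
```

With this shape, a device-only nexthop (no gateway attribute) passes
validation, and a nexthop carrying a truncated gateway attribute is
still rejected with -EINVAL.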

Re: [PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by David Ahern 2 months, 3 weeks ago

On 11/16/25 11:31 AM, azey wrote:
> At some point after b5d2d75e079a ("net/ipv6: Do not allow device only
> routes via the multipath API"), the IPv6 stack was updated such that
> device-only multipath routes can be installed and work correctly, but
> still weren't allowed in the code.
> 
> This change removes the has_gateway check from rtm_to_fib6_multipath_config()
> and the fib_nh_gw_family check from rt6_qualify_for_ecmp(), allowing
> device-only multipath routes to be installed again.
> 

My recollection is that device-only legs of an ECMP route are only valid
with the separate nexthop code. Added Nicolas (author of the original
IPv4 multipath code) to keep me honest.

Re: [PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by Nicolas Dichtel 2 months, 3 weeks ago

Le 17/11/2025 à 02:57, David Ahern a écrit :
> On 11/16/25 11:31 AM, azey wrote:
>> At some point after b5d2d75e079a ("net/ipv6: Do not allow device only
>> routes via the multipath API"), the IPv6 stack was updated such that
>> device-only multipath routes can be installed and work correctly, but
>> still weren't allowed in the code.
>>
>> This change removes the has_gateway check from rtm_to_fib6_multipath_config()
>> and the fib_nh_gw_family check from rt6_qualify_for_ecmp(), allowing
>> device-only multipath routes to be installed again.
>>
> 
> My recollection is that device-only legs of an ECMP route are only valid
> with the separate nexthop code. Added Nicolas (author of the original
> IPv4 multipath code) to keep me honest.
If I remember correctly, it was to avoid merging connected routes into ECMP
routes. For example, fe80::, but also the case where two interfaces have an
address in the same prefix. With the current code, the last route will always
be used. With this patch, packets will be distributed across the two
interfaces, right?
If yes, it may cause regressions on some setups.

Regards,
Nicolas
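
The behavioural difference described above can be illustrated with a toy
model. This is only a sketch under loud assumptions: the function names
are hypothetical, "last route wins" is taken from the description in
this message, and the modulo-based pick merely stands in for the
kernel's real multipath selection, which is more involved.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Current behaviour as described in the thread: with several connected
 * routes for the same prefix and metric, the last one is always used.
 * Returns the index of the chosen route; route_count must be > 0.
 */
static inline size_t model_select_without_ecmp(size_t route_count)
{
	return route_count - 1;
}

/*
 * Behaviour with the patch: the routes are merged into one multipath
 * route, so the choice depends on the flow. A plain modulo over a flow
 * hash is an assumption made here for illustration only.
 * Returns the index of the chosen route; route_count must be > 0.
 */
static inline size_t model_select_with_ecmp(size_t route_count,
					    uint32_t flow_hash)
{
	return flow_hash % route_count;
}
```

In this model, two flows with different hashes can leave through
different interfaces once the routes are merged, whereas today every
flow uses the same one. A setup that depends on the single egress
interface would see that as a regression.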

Re: [PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by azey 2 months, 3 weeks ago

On 2025-11-18 10:05:55, +0100 Nicolas Dichtel wrote:
> If I remember well, it was to avoid merging connected routes to ECMP routes.
> For example, fe80:: but also if two interfaces have an address in the same
> prefix. With the current code, the last route will always be used. With this
> patch, packets will be distributed across the two interfaces, right?
> If yes, it may cause regression on some setups.

Thanks! Yes, with this patch routes with the same destination and metric automatically
become multipath. From my testing, for link-locals this shouldn't make a difference
as the interface must always be specified with % anyway.

For non-LL addresses, this could indeed cause a regression in obscure setups. In my
opinion, though, it is very unlikely that anyone who has two routes with the same
prefix and metric (which, AFAIK, isn't really a supported configuration without
ECMP anyway) relies on this quirk. The most plausible setup relying on it that I can
think of would be a server with two interfaces on the same L2 segment, and a
firewall somewhere that only allows the source address of one interface through.

IMO, setups like that are more of a misconfiguration than a "practical use case"
that'd make this a real regression, but I'd completely understand if it'd be enough
to block this.

Re: [PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by David Ahern 2 months, 3 weeks ago

On 11/18/25 4:00 AM, azey wrote:
> On 2025-11-18 10:05:55, +0100 Nicolas Dichtel wrote:
>> If I remember well, it was to avoid merging connected routes to ECMP routes.
>> For example, fe80:: but also if two interfaces have an address in the same
>> prefix. With the current code, the last route will always be used. With this
>> patch, packets will be distributed across the two interfaces, right?
>> If yes, it may cause regression on some setups.
> 
> Thanks! Yes, with this patch routes with the same destination and metric automatically
> become multipath. From my testing, for link-locals this shouldn't make a difference
> as the interface must always be specified with % anyway.
> 
> For non-LL addresses, this could indeed cause a regression in obscure setups. In my
> opinion though, I feel that it is very unlikely anyone who has two routes with the
> same prefix and metric (which AFAIK, isn't really a supported configuration without
> ECMP anyway) relies on this quirk. The most plausible setup relying on this I can
> think of would be a server with two interfaces on the same L2 segment, and a
> firewall somewhere that only allows the source address of one interface through.
> 
> IMO, setups like that are more of a misconfiguration than a "practical use case"
> that'd make this a real regression, but I'd completely understand if it'd be enough
> to block this.

There is really no reason to take a risk of a regression. If someone
wants ecmp with device only nexthops, then use the new nexthop infra to
do it.

Re: [PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by azey 2 months, 3 weeks ago

On 2025-11-18 17:04:38 +0100,  David Ahern <dsahern@kernel.org> wrote:
> There is really no reason to take a risk of a regression. If someone
> wants ecmp with device only nexthops, then use the new nexthop infra to
> do it.

My initial reason was that device-only ECMP via `ip route` works with IPv4
but not IPv6, so I thought it'd make sense to unify functionality - but if
this is final I won't argue any further.

Thanks again for the reviews, and sorry for potentially wasting your time.

Re: [PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by David Ahern 2 months, 3 weeks ago

On 11/18/25 9:47 AM, azey wrote:
> On 2025-11-18 17:04:38 +0100,  David Ahern <dsahern@kernel.org> wrote:
>> There is really no reason to take a risk of a regression. If someone
>> wants ecmp with device only nexthops, then use the new nexthop infra to
>> do it.
> 
> My initial reason was that device-only ECMP via `ip route` works with IPv4
> but not IPv6, so I thought it'd make sense to unify functionality - but if
> this is final I won't argue any further.
> 

There was a push many years ago to align v4 and v6 as much as possible.
Certain areas - like ipv6 multipath - proved to be too difficult and
ended up causing regressions.

Re: [PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by Nicolas Dichtel 2 months, 3 weeks ago

Le 18/11/2025 à 17:04, David Ahern a écrit :
> On 11/18/25 4:00 AM, azey wrote:
>> On 2025-11-18 10:05:55, +0100 Nicolas Dichtel wrote:
>>> If I remember well, it was to avoid merging connected routes to ECMP routes.
>>> For example, fe80:: but also if two interfaces have an address in the same
>>> prefix. With the current code, the last route will always be used. With this
>>> patch, packets will be distributed across the two interfaces, right?
>>> If yes, it may cause regression on some setups.
>>
>> Thanks! Yes, with this patch routes with the same destination and metric automatically
>> become multipath. From my testing, for link-locals this shouldn't make a difference
>> as the interface must always be specified with % anyway.
>>
>> For non-LL addresses, this could indeed cause a regression in obscure setups. In my
Having an address in the same prefix on two interfaces is not an "obscure setup".

>> opinion though, I feel that it is very unlikely anyone who has two routes with the
>> same prefix and metric (which AFAIK, isn't really a supported configuration without
>> ECMP anyway) relies on this quirk. The most plausible setup relying on this I can
>> think of would be a server with two interfaces on the same L2 segment, and a
>> firewall somewhere that only allows the source address of one interface through.
>>
>> IMO, setups like that are more of a misconfiguration than a "practical use case"
>> that'd make this a real regression, but I'd completely understand if it'd be enough
>> to block this.
> 
> There is really no reason to take a risk of a regression. If someone
> wants ecmp with device only nexthops, then use the new nexthop infra to
> do it.
+1

Re: [PATCH] net/ipv6: allow device-only routes via the multipath API

Posted by azey 2 months, 3 weeks ago

On 2025-11-18 17:41:14 +0100,  Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
> Having an address in the same prefix on two interfaces is not an "obscure setup".

Sorry, just a clarification on this, since I didn't get the email in time before sending
my reply to David: I meant specifically the case where someone relies on the last route
always being selected in this scenario. Setups that don't rely on that shouldn't be affected.