[PATCH net-next v1] ipv6: fix a BUG in rt6_get_pcpu_route() under PREEMPT_RT

Jiayuan Chen posted 1 patch 1 week, 2 days ago
There is a newer version of this series
Posted by Jiayuan Chen 1 week, 2 days ago
On PREEMPT_RT kernels, after rt6_get_pcpu_route() returns NULL, the
current task can be preempted. Another task running on the same CPU
may then execute rt6_make_pcpu_route() and successfully install a
pcpu_rt entry. When the first task resumes execution, its cmpxchg()
in rt6_make_pcpu_route() will fail because rt6i_pcpu is no longer
NULL, triggering the BUG_ON(prev). The bug is easy to reproduce by
adding an mdelay() after rt6_get_pcpu_route().

Using preempt_disable()/preempt_enable() is not appropriate here because
ip6_rt_pcpu_alloc() may sleep.

Fix this by:
1. Adding migrate_disable()/migrate_enable() to ensure consistent per-cpu
   pointer access across potential preemption points.
2. Removing the BUG_ON and instead handling the race gracefully by
   freeing our allocation and returning the existing pcpu_rt when
   cmpxchg() fails.
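
For illustration only, the "lose the race gracefully" handling in item 2
can be modeled in userspace with C11 atomics. This is a minimal sketch,
not the kernel code: struct route, pcpu_slot, install_or_reuse() and
demo_race() are hypothetical names, malloc()/free() stand in for the dst
allocation and release, and a single global slot stands in for one CPU's
rt6i_pcpu pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rt6_info. */
struct route {
	int id;
};

/* Hypothetical stand-in for one CPU's rt6i_pcpu slot; starts out NULL. */
static _Atomic(struct route *) pcpu_slot;

/*
 * Allocate an entry and try to install it. If the slot was filled in the
 * meantime (the preemption window), free our copy and return the entry
 * that is already installed instead of treating it as a fatal error.
 */
static struct route *install_or_reuse(int id)
{
	struct route *mine = malloc(sizeof(*mine));
	struct route *expected = NULL;

	if (!mine)
		return NULL;
	mine->id = id;

	/* Succeeds only if the slot still holds NULL. */
	if (atomic_compare_exchange_strong(&pcpu_slot, &expected, mine))
		return mine;

	/* Lost the race: 'expected' now holds the winner, like 'prev'. */
	free(mine);
	return expected;
}

/* Returns 1 if the second caller received the first caller's entry. */
static int demo_race(void)
{
	struct route *a = install_or_reuse(1); /* other task fills the slot */
	struct route *b = install_or_reuse(2); /* resumed task loses the race */
	int same = (a != NULL && a == b && b->id == 1);

	assert(a == NULL || atomic_load(&pcpu_slot) == a);

	/* Reset the slot and release the entries for the next run. */
	atomic_store(&pcpu_slot, NULL);
	if (b != a)
		free(b);
	free(a);
	return same;
}
```

Calling demo_race() from a test program should return 1: the second
install attempt finds the slot occupied, frees its own allocation and
hands back the existing entry, which is the behavior this patch
substitutes for the BUG_ON().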

Link: https://syzkaller.appspot.com/bug?extid=9b35e9bc0951140d13e6
Fixes: 951f788a80ff ("ipv6: fix a BUG in rt6_get_pcpu_route()")
Reported-by: syzbot+9b35e9bc0951140d13e6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6918cd88.050a0220.1c914e.0045.GAE@google.com/T/
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 net/ipv6/route.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index aee6a10b112a..44c34baad0e4 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1470,7 +1470,16 @@ static struct rt6_info *rt6_make_pcpu_route(struct net *net,
 
 	p = this_cpu_ptr(res->nh->rt6i_pcpu);
 	prev = cmpxchg(p, NULL, pcpu_rt);
-	BUG_ON(prev);
+	if (unlikely(prev)) {
+		/*
+		 * Another task on this CPU already installed a pcpu_rt.
+		 * This can happen on PREEMPT_RT where preemption is possible.
+		 * Free our allocation and return the existing one.
+		 */
+		dst_dev_put(&pcpu_rt->dst);
+		dst_release(&pcpu_rt->dst);
+		return prev;
+	}
 
 	if (res->f6i->fib6_destroying) {
 		struct fib6_info *from;
@@ -2299,11 +2308,13 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 	} else {
 		/* Get a percpu copy */
 		local_bh_disable();
+		migrate_disable();
 		rt = rt6_get_pcpu_route(&res);
 
 		if (!rt)
 			rt = rt6_make_pcpu_route(net, &res);
 
+		migrate_enable();
 		local_bh_enable();
 	}
 out:
-- 
2.43.0
Re: [PATCH net-next v1] ipv6: fix a BUG in rt6_get_pcpu_route() under PREEMPT_RT
Posted by Paolo Abeni 19 hours ago
On 12/9/25 1:48 PM, Jiayuan Chen wrote:
> @@ -2299,11 +2308,13 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
>  	} else {
>  		/* Get a percpu copy */
>  		local_bh_disable();
> +		migrate_disable();
>  		rt = rt6_get_pcpu_route(&res);
>  
>  		if (!rt)
>  			rt = rt6_make_pcpu_route(net, &res);
>  
> +		migrate_enable();

AFAICS, this part is not needed: local_bh_disable() ensures migrating is
already disabled, if !CONFIG_PREEMPT_RT_NEEDS_BH_LOCK or preemption is
disabled, when CONFIG_PREEMPT_RT_NEEDS_BH_LOCK==y

Side note: this patch looks suitable for the 'net' tree, please change
the subj prefix accordingly in the next revision.

Cheers,

Paolo
Re: [PATCH net-next v1] ipv6: fix a BUG in rt6_get_pcpu_route() under PREEMPT_RT
Posted by Steven Rostedt 11 hours ago
On Thu, 18 Dec 2025 13:25:31 +0100
Paolo Abeni <pabeni@redhat.com> wrote:

> AFAICS, this part is not needed: local_bh_disable() ensures migrating is
> already disabled, if !CONFIG_PREEMPT_RT_NEEDS_BH_LOCK or preemption is
> disabled, when CONFIG_PREEMPT_RT_NEEDS_BH_LOCK==y

As the code has this:

	/* First entry of a task into a BH disabled section? */
	if (!current->softirq_disable_cnt) {
		if (preemptible()) {
			if (IS_ENABLED(CONFIG_PREEMPT_RT_NEEDS_BH_LOCK))
				local_lock(&softirq_ctrl.lock);
			else
				migrate_disable();

			/* Required to meet the RCU bottomhalf requirements. */
			rcu_read_lock();
		} else {
			DEBUG_LOCKS_WARN_ON(this_cpu_read(softirq_ctrl.cnt));
		}
	}

It looks as though migration will always be disabled (local_lock() also
disables migration). It will warn if preemption is disabled.

But yeah, the added migrate_disable() is not needed.

-- Steve