[PATCH net v2] ipv6: fix data race in fib6_metric_set() using cmpxchg

Hangbin Liu posted 1 patch 6 days, 14 hours ago
There is a newer version of this series
net/ipv6/ip6_fib.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
Posted by Hangbin Liu 6 days, 14 hours ago
fib6_metric_set() may be called concurrently from softirq context without
holding the FIB table lock. A typical path is:

  ndisc_router_discovery()
    spin_unlock_bh(&table->tb6_lock)        <- lock released
    fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call

When two CPUs process Router Advertisement packets for the same router
simultaneously, they can both arrive at fib6_metric_set() with the same
fib6_info pointer whose fib6_metrics still points to dst_default_metrics.

  if (f6i->fib6_metrics == &dst_default_metrics) {   /* both CPUs: true */
      struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
      refcount_set(&p->refcnt, 1);
      f6i->fib6_metrics = p;   /* CPU1 overwrites CPU0's p -> p0 leaked */
  }

The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
to it anywhere in memory, producing a kmemleak report:

  unreferenced object 0xff1100025aca1400 (size 96):
    comm "softirq", pid 0, jiffies 4299271239
    backtrace:
      kmalloc_trace+0x28a/0x380
      fib6_metric_set+0xcd/0x180
      ndisc_router_discovery+0x12dc/0x24b0
      icmpv6_rcv+0xc16/0x1360
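
The interleaving above can be replayed deterministically in user space. The
sketch below is only an illustration: struct metrics, struct route, and
replay_racy_interleaving() are hypothetical stand-ins, not kernel code, and
the two "CPUs" are simulated as steps on a single thread.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for dst_metrics and fib6_info. */
struct metrics {
	unsigned int vals[4];
};

static struct metrics default_metrics;

struct route {
	struct metrics *m;
};

/*
 * Replay the racy ordering step by step: both "CPUs" pass the
 * default check before either one stores its new pointer.
 * Returns the number of allocations left unreachable from the route,
 * or -1 if the replay could not be set up.
 */
static int replay_racy_interleaving(void)
{
	struct route r = { .m = &default_metrics };
	struct metrics *p0, *p1;
	int cpu0_sees_default, cpu1_sees_default;
	int leaked = 0;

	/* Both CPUs evaluate the check before any store happens. */
	cpu0_sees_default = (r.m == &default_metrics);
	cpu1_sees_default = (r.m == &default_metrics);
	if (!cpu0_sees_default || !cpu1_sees_default)
		return -1;

	p0 = calloc(1, sizeof(*p0));
	p1 = calloc(1, sizeof(*p1));
	if (!p0 || !p1) {
		free(p0);
		free(p1);
		return -1;
	}

	r.m = p0;	/* CPU0 publishes its allocation */
	r.m = p1;	/* CPU1 overwrites it with a plain store */

	/* The route no longer references p0. */
	if (r.m != p0)
		leaked++;

	/* The demo still holds both pointers in locals, so it can clean up. */
	free(p0);
	free(p1);
	return leaked;
}
```

In the kernel there is no such local copy left once fib6_metric_set()
returns, which is why kmemleak flags the object as unreferenced.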

Fix this by:
 - Setting the value in p->metrics before publishing the pointer via
   cmpxchg(), so the metric is ready before the pointer becomes visible
   to other CPUs.
 - Replacing the plain pointer store with cmpxchg() and freeing the
   allocation when the cmpxchg() loses the race.
 - Adding READ_ONCE()/WRITE_ONCE() to the metrics[] update in the
   non-default metrics path to prevent compiler-induced data races.
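
For readers who want to try the pattern outside the kernel, here is a
minimal user-space sketch using C11 atomics. It is an analogy under stated
assumptions, not the patch itself: struct metrics, struct route,
metric_set(), and metric_get() are hypothetical names,
atomic_compare_exchange_strong() stands in for cmpxchg(), and relaxed atomic
accesses stand in for READ_ONCE()/WRITE_ONCE(). Unlike the patch, the loser
reuses the pointer returned by the failed compare-exchange instead of
re-reading it.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_METRICS 16	/* illustrative size, not the kernel's RTAX_MAX */

struct metrics {
	_Atomic uint32_t vals[NR_METRICS];
};

/* Shared read-only default, standing in for dst_default_metrics. */
static const struct metrics default_metrics = { { 0 } };

struct route {
	_Atomic(struct metrics *) m;
};

static void route_init(struct route *r)
{
	/* The cast drops const, much like the dflt variable in the patch. */
	atomic_init(&r->m, (struct metrics *)&default_metrics);
}

/*
 * Returns 1 if this call installed a private copy, 0 if it wrote into
 * an already-installed copy, and -1 on allocation failure.
 */
static int metric_set(struct route *r, int metric, uint32_t val)
{
	struct metrics *dflt = (struct metrics *)&default_metrics;
	struct metrics *cur = atomic_load(&r->m);
	struct metrics *expected = dflt;
	struct metrics *p;

	if (cur == dflt) {
		p = calloc(1, sizeof(*p));
		if (!p)
			return -1;

		/* Fill in the value before the pointer becomes visible. */
		atomic_init(&p->vals[metric - 1], val);

		if (atomic_compare_exchange_strong(&r->m, &expected, p))
			return 1;

		/* Lost the race: free our copy and use the winner's. */
		free(p);
		cur = expected;
	}

	atomic_store_explicit(&cur->vals[metric - 1], val,
			      memory_order_relaxed);
	return 0;
}

static uint32_t metric_get(struct route *r, int metric)
{
	struct metrics *cur = atomic_load(&r->m);

	return atomic_load_explicit(&cur->vals[metric - 1],
				    memory_order_relaxed);
}
```

Because only one compare-exchange can replace the default pointer, every
other allocation is freed by the thread that made it, so nothing is left
unreferenced.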

Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
Reported-by: Fei Liu <feliu@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
Changes in v2:
- Set the value in p->metrics before publishing it via cmpxchg() (Eric Dumazet)
- Add READ_ONCE()/WRITE_ONCE() for the metrics[] update (Jiayuan Chen)
- Link to v1: https://lore.kernel.org/r/20260326-b4-fib6_metric_set-kmemleak-v1-1-c89fc1b312c0@gmail.com
---
 net/ipv6/ip6_fib.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index dd26657b6a4a..2a7cc33fbcef 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -730,17 +730,24 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
 	if (!f6i)
 		return;
 
-	if (f6i->fib6_metrics == &dst_default_metrics) {
+	if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
+		struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
 		struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
 
 		if (!p)
 			return;
 
+		p->metrics[metric - 1] = val;
 		refcount_set(&p->refcnt, 1);
-		f6i->fib6_metrics = p;
+		if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
+			kfree(p);
+		else
+			return;
 	}
 
-	f6i->fib6_metrics->metrics[metric - 1] = val;
+	struct dst_metrics *m = READ_ONCE(f6i->fib6_metrics);
+
+	WRITE_ONCE(m->metrics[metric - 1], val);
 }
 
 /*

---
base-commit: c4ea7d8907cf72b259bf70bd8c2e791e1c4ff70f
change-id: 20260326-b4-fib6_metric_set-kmemleak-7aa51978284a

Best regards,
-- 
Hangbin Liu <liuhangbin@gmail.com>
Re: [PATCH net v2] ipv6: fix data race in fib6_metric_set() using cmpxchg
Posted by Jakub Kicinski 2 days, 16 hours ago
On Fri, 27 Mar 2026 10:24:47 +0800 Hangbin Liu wrote:
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -730,17 +730,24 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
>  	if (!f6i)
>  		return;
>  
> -	if (f6i->fib6_metrics == &dst_default_metrics) {
> +	if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
> +		struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;

Why does this exist? To cast away the const?

>  		struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
>  
>  		if (!p)
>  			return;
>  
> +		p->metrics[metric - 1] = val;
>  		refcount_set(&p->refcnt, 1);
> -		f6i->fib6_metrics = p;
> +		if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> +			kfree(p);
> +		else
> +			return;
>  	}
>  
> -	f6i->fib6_metrics->metrics[metric - 1] = val;
> +	struct dst_metrics *m = READ_ONCE(f6i->fib6_metrics);

No variable declarations in the middle of a function please.

> +	WRITE_ONCE(m->metrics[metric - 1], val);
>  }
Re: [PATCH net v2] ipv6: fix data race in fib6_metric_set() using cmpxchg
Posted by Hangbin Liu 2 days, 16 hours ago
On Mon, Mar 30, 2026 at 05:46:28PM -0700, Jakub Kicinski wrote:
> On Fri, 27 Mar 2026 10:24:47 +0800 Hangbin Liu wrote:
> > --- a/net/ipv6/ip6_fib.c
> > +++ b/net/ipv6/ip6_fib.c
> > @@ -730,17 +730,24 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
> >  	if (!f6i)
> >  		return;
> >  
> > -	if (f6i->fib6_metrics == &dst_default_metrics) {
> > +	if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
> > +		struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
> 
> Why does this exist? To cast away the const?

Yes, because cmpxchg() doesn't accept a const-qualified type.

> 
> >  		struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> >  
> >  		if (!p)
> >  			return;
> >  
> > +		p->metrics[metric - 1] = val;
> >  		refcount_set(&p->refcnt, 1);
> > -		f6i->fib6_metrics = p;
> > +		if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> > +			kfree(p);
> > +		else
> > +			return;
> >  	}
> >  
> > -	f6i->fib6_metrics->metrics[metric - 1] = val;
> > +	struct dst_metrics *m = READ_ONCE(f6i->fib6_metrics);
> 
> No variable declarations in the middle of a function please.

Oh, I thought it was OK now since the kernel supports C99...

I will fix it.

Thanks
Hangbin
Re: [PATCH net v2] ipv6: fix data race in fib6_metric_set() using cmpxchg
Posted by Jiayuan Chen 5 days, 5 hours ago
On 3/27/26 10:24 AM, Hangbin Liu wrote:
> fib6_metric_set() may be called concurrently from softirq context without
> holding the FIB table lock. A typical path is:
>
>    ndisc_router_discovery()
>      spin_unlock_bh(&table->tb6_lock)        <- lock released
>      fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call
>
> When two CPUs process Router Advertisement packets for the same router
> simultaneously, they can both arrive at fib6_metric_set() with the same
> fib6_info pointer whose fib6_metrics still points to dst_default_metrics.
>
>    if (f6i->fib6_metrics == &dst_default_metrics) {   /* both CPUs: true */
>        struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
>        refcount_set(&p->refcnt, 1);
>        f6i->fib6_metrics = p;   /* CPU1 overwrites CPU0's p -> p0 leaked */
>    }
>
> The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
> to it anywhere in memory, producing a kmemleak report:
>
>    unreferenced object 0xff1100025aca1400 (size 96):
>      comm "softirq", pid 0, jiffies 4299271239
>      backtrace:
>        kmalloc_trace+0x28a/0x380
>        fib6_metric_set+0xcd/0x180
>        ndisc_router_discovery+0x12dc/0x24b0
>        icmpv6_rcv+0xc16/0x1360
>
> Fix this by:
>   - Setting the value in p->metrics before publishing the pointer via
>     cmpxchg(), so the metric is ready before the pointer becomes visible
>     to other CPUs.
>   - Replacing the plain pointer store with cmpxchg() and freeing the
>     allocation when the cmpxchg() loses the race.
>   - Adding READ_ONCE()/WRITE_ONCE() to the metrics[] update in the
>     non-default metrics path to prevent compiler-induced data races.
>
> Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
> Reported-by: Fei Liu <feliu@redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>

https://sashiko.dev/#/patchset/20260327-b4-fib6_metric_set-kmemleak-v2-1-366b2c78b5c2%40gmail.com


The concern about reader paths (e.g., ip_dst_init_metrics, fib6_pmtu)
lacking READ_ONCE() annotations is valid: if the compiler reloads
from->fib6_metrics after inlining, it could produce an inconsistent
pointer/flags combination in dst->_metrics, potentially leading to a
refcount_dec on the read-only dst_default_metrics.

However, this issue predates this patch. The plain store
f6i->fib6_metrics = p in the original code has the same read-side race.
This patch focuses on fixing the writer-side data race that causes the
kmemleak report, and it does so correctly.
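
To illustrate the reader-side concern, here is a rough user-space sketch of
the "load once, then derive both pointer and flags from that one value"
pattern. All names (struct metrics, struct route, init_packed_metrics(), and
the two flag macros) are hypothetical, and atomic_load() stands in for
READ_ONCE(). It is not the kernel's reader code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative flag values, packed into the low bits of a pointer. */
#define METRICS_READ_ONLY	0x1UL
#define METRICS_REFCOUNTED	0x2UL

struct metrics {
	atomic_int refcnt;
	uint32_t vals[4];
};

/* Shared default that must never be refcounted. */
static struct metrics default_metrics;

struct route {
	_Atomic(struct metrics *) m;
};

/*
 * Build a packed pointer|flags word from a single snapshot of r->m,
 * so the pointer and the flag always describe the same object even if
 * a writer swaps r->m concurrently.
 */
static uintptr_t init_packed_metrics(struct route *r)
{
	struct metrics *m = atomic_load(&r->m);	/* the only load of r->m */
	uintptr_t packed = (uintptr_t)m;

	if (m == &default_metrics) {
		packed |= METRICS_READ_ONLY;
	} else {
		packed |= METRICS_REFCOUNTED;
		atomic_fetch_add(&m->refcnt, 1);
	}

	return packed;
}
```

If r->m were instead read twice with plain loads, the comparison could see
the default pointer while the packed word holds the private one, or the
reverse, which is the inconsistent combination described above.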

BTW, please consider moving the declaration of m to the top of the
function if you send a next version.


Re: [PATCH net v2] ipv6: fix data race in fib6_metric_set() using cmpxchg
Posted by Hangbin Liu 2 days, 15 hours ago
On Sat, Mar 28, 2026 at 07:22:48PM +0800, Jiayuan Chen wrote:
> https://sashiko.dev/#/patchset/20260327-b4-fib6_metric_set-kmemleak-v2-1-366b2c78b5c2%40gmail.com
> 
> 
> The concern about reader paths (e.g., ip_dst_init_metrics, fib6_pmtu)
> lacking READ_ONCE() annotations is valid: if the compiler reloads
> from->fib6_metrics after inlining, it could produce an inconsistent
> pointer/flags combination in dst->_metrics, potentially leading to a
> refcount_dec on the read-only dst_default_metrics.

Thanks, I will fix the reader paths separately, so that anything I miss
there does not slow down this patch.

Thanks
Hangbin