net/ipv6/ip6_fib.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
fib6_metric_set() may be called concurrently from softirq context without
holding the FIB table lock. A typical path is:
ndisc_router_discovery()
spin_unlock_bh(&table->tb6_lock) <- lock released
fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call
When two CPUs process Router Advertisement packets for the same router
simultaneously, they can both arrive at fib6_metric_set() with the same
fib6_info pointer whose fib6_metrics still points to dst_default_metrics.
if (f6i->fib6_metrics == &dst_default_metrics) { /* both CPUs: true */
struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
refcount_set(&p->refcnt, 1);
f6i->fib6_metrics = p; /* CPU1 overwrites CPU0's p -> p0 leaked */
}
The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
to it anywhere in memory, producing a kmemleak report:
unreferenced object 0xff1100025aca1400 (size 96):
comm "softirq", pid 0, jiffies 4299271239
backtrace:
kmalloc_trace+0x28a/0x380
fib6_metric_set+0xcd/0x180
ndisc_router_discovery+0x12dc/0x24b0
icmpv6_rcv+0xc16/0x1360
Fix this by replacing the plain pointer store with cmpxchg() and free
the allocation safely when competition failed.
Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
Reported-by: Fei Liu <feliu@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
net/ipv6/ip6_fib.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index dd26657b6a4a..64de761f40d5 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
if (!f6i)
return;
- if (f6i->fib6_metrics == &dst_default_metrics) {
+ if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
+ struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
if (!p)
return;
refcount_set(&p->refcnt, 1);
- f6i->fib6_metrics = p;
+ if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
+ kfree(p);
}
f6i->fib6_metrics->metrics[metric - 1] = val;
---
base-commit: c4ea7d8907cf72b259bf70bd8c2e791e1c4ff70f
change-id: 20260326-b4-fib6_metric_set-kmemleak-7aa51978284a
Best regards,
--
Hangbin Liu <liuhangbin@gmail.com>
On Wed, Mar 25, 2026 at 9:22 PM Hangbin Liu <liuhangbin@gmail.com> wrote:
>
> fib6_metric_set() may be called concurrently from softirq context without
> holding the FIB table lock. A typical path is:
>
> ndisc_router_discovery()
> spin_unlock_bh(&table->tb6_lock) <- lock released
> fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call
>
> When two CPUs process Router Advertisement packets for the same router
> simultaneously, they can both arrive at fib6_metric_set() with the same
> fib6_info pointer whose fib6_metrics still points to dst_default_metrics.
>
> if (f6i->fib6_metrics == &dst_default_metrics) { /* both CPUs: true */
> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> refcount_set(&p->refcnt, 1);
> f6i->fib6_metrics = p; /* CPU1 overwrites CPU0's p -> p0 leaked */
> }
>
> The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
> to it anywhere in memory, producing a kmemleak report:
>
> unreferenced object 0xff1100025aca1400 (size 96):
> comm "softirq", pid 0, jiffies 4299271239
> backtrace:
> kmalloc_trace+0x28a/0x380
> fib6_metric_set+0xcd/0x180
> ndisc_router_discovery+0x12dc/0x24b0
> icmpv6_rcv+0xc16/0x1360
>
> Fix this by replacing the plain pointer store with cmpxchg() and free
> the allocation safely when competition failed.
>
> Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
> Reported-by: Fei Liu <feliu@redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
> net/ipv6/ip6_fib.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> index dd26657b6a4a..64de761f40d5 100644
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
> if (!f6i)
> return;
>
> - if (f6i->fib6_metrics == &dst_default_metrics) {
> + if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
> + struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
>
> if (!p)
> return;
>
> refcount_set(&p->refcnt, 1);
> - f6i->fib6_metrics = p;
> + if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> + kfree(p);
> }
>
The following line should happen before the cmpxchg(),
->metrics[X] accesses also need READ_ONCE()/WRITE_ONCE() annotations.
> f6i->fib6_metrics->metrics[metric - 1] = val;
>
> ---
> base-commit: c4ea7d8907cf72b259bf70bd8c2e791e1c4ff70f
> change-id: 20260326-b4-fib6_metric_set-kmemleak-7aa51978284a
>
> Best regards,
> --
> Hangbin Liu <liuhangbin@gmail.com>
>
On Thu, Mar 26, 2026 at 05:05:57AM -0700, Eric Dumazet wrote:
> > diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> > index dd26657b6a4a..64de761f40d5 100644
> > --- a/net/ipv6/ip6_fib.c
> > +++ b/net/ipv6/ip6_fib.c
> > @@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
> > if (!f6i)
> > return;
> >
> > - if (f6i->fib6_metrics == &dst_default_metrics) {
> > + if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
> > + struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
> > struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> >
> > if (!p)
> > return;
> >
> > refcount_set(&p->refcnt, 1);
> > - f6i->fib6_metrics = p;
> > + if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> > + kfree(p);
> > }
> >
>
> The following line should happen before the cmpxchg(),
> ->metrics[X] accesses also need READ_ONCE()/WRITE_ONCE() annotations.
Hi Eric,
Jiayuan also suggested to using READ_ONCE()/WRITE_ONCE() for metrics[X]
accesses. But I don't get why this line should happen before the cmpxchg(),
Would you please help explain?
>
>
> > f6i->fib6_metrics->metrics[metric - 1] = val;
> >
Thanks
Hangbin
On 3/26/26 9:13 PM, Hangbin Liu wrote:
> On Thu, Mar 26, 2026 at 05:05:57AM -0700, Eric Dumazet wrote:
>>> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
>>> index dd26657b6a4a..64de761f40d5 100644
>>> --- a/net/ipv6/ip6_fib.c
>>> +++ b/net/ipv6/ip6_fib.c
>>> @@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
>>> if (!f6i)
>>> return;
>>>
>>> - if (f6i->fib6_metrics == &dst_default_metrics) {
>>> + if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
>>> + struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
>>> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
>>>
>>> if (!p)
>>> return;
>>>
>>> refcount_set(&p->refcnt, 1);
>>> - f6i->fib6_metrics = p;
>>> + if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
>>> + kfree(p);
>>> }
>>>
>> The following line should happen before the cmpxchg(),
>> ->metrics[X] accesses also need READ_ONCE()/WRITE_ONCE() annotations.
> Hi Eric,
>
> Jiayuan also suggested to using READ_ONCE()/WRITE_ONCE() for metrics[X]
> accesses. But I don't get why this line should happen before the cmpxchg(),
> Would you please help explain?
I think what Eric means is something like this:
...
struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
if (!p)
return;
p->metrics[metric - 1] = val;
refcount_set(&p->refcnt, 1);
if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
kfree(p);
else
return;
}
}
m = READ_ONCE(f6i->fib6_metrics);
WRITE_ONCE(m->metrics[metric - 1], val);
Since p is private data before being published via cmpxchg(), we can
safely initialize its metrics beforehand. This way we don't need to
worry about concurrent access to f6i->fib6_metrics->metrics[] during
initialization. Right?
On Thu, Mar 26, 2026 at 6:44 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
>
> On 3/26/26 9:13 PM, Hangbin Liu wrote:
> > On Thu, Mar 26, 2026 at 05:05:57AM -0700, Eric Dumazet wrote:
> >>> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> >>> index dd26657b6a4a..64de761f40d5 100644
> >>> --- a/net/ipv6/ip6_fib.c
> >>> +++ b/net/ipv6/ip6_fib.c
> >>> @@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
> >>> if (!f6i)
> >>> return;
> >>>
> >>> - if (f6i->fib6_metrics == &dst_default_metrics) {
> >>> + if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
> >>> + struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
> >>> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> >>>
> >>> if (!p)
> >>> return;
> >>>
> >>> refcount_set(&p->refcnt, 1);
> >>> - f6i->fib6_metrics = p;
> >>> + if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> >>> + kfree(p);
> >>> }
> >>>
> >> The following line should happen before the cmpxchg(),
> >> ->metrics[X] accesses also need READ_ONCE()/WRITE_ONCE() annotations.
> > Hi Eric,
> >
> > Jiayuan also suggested to using READ_ONCE()/WRITE_ONCE() for metrics[X]
> > accesses. But I don't get why this line should happen before the cmpxchg(),
> > Would you please help explain?
>
>
> I think what Eric means is something like this:
>
>
> ...
> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
>
> if (!p)
> return;
>
> p->metrics[metric - 1] = val;
> refcount_set(&p->refcnt, 1);
> if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> kfree(p);
> else
> return;
> }
> }
>
> m = READ_ONCE(f6i->fib6_metrics);
> WRITE_ONCE(m->metrics[metric - 1], val);
>
>
> Since p is private data before being published via cmpxchg(), we can
> safely initialize its metrics beforehand. This way we don't need to
> worry about concurrent access to f6i->fib6_metrics->metrics[] during
> initialization. Right?
Yes. Think about RCU (Read Copy Update) rules.
We allocate an object, and populate it, then make sure changes are
committed, before publishing the new pointer.
Othewise an other cpu could read a 0 metric, while we wanted something else.
On Thu, Mar 26, 2026 at 07:01:36AM -0700, Eric Dumazet wrote:
> On Thu, Mar 26, 2026 at 6:44 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> >
> >
> > On 3/26/26 9:13 PM, Hangbin Liu wrote:
> > > On Thu, Mar 26, 2026 at 05:05:57AM -0700, Eric Dumazet wrote:
> > >>> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> > >>> index dd26657b6a4a..64de761f40d5 100644
> > >>> --- a/net/ipv6/ip6_fib.c
> > >>> +++ b/net/ipv6/ip6_fib.c
> > >>> @@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
> > >>> if (!f6i)
> > >>> return;
> > >>>
> > >>> - if (f6i->fib6_metrics == &dst_default_metrics) {
> > >>> + if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
> > >>> + struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
> > >>> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> > >>>
> > >>> if (!p)
> > >>> return;
> > >>>
> > >>> refcount_set(&p->refcnt, 1);
> > >>> - f6i->fib6_metrics = p;
> > >>> + if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> > >>> + kfree(p);
> > >>> }
> > >>>
> > >> The following line should happen before the cmpxchg(),
> > >> ->metrics[X] accesses also need READ_ONCE()/WRITE_ONCE() annotations.
> > > Hi Eric,
> > >
> > > Jiayuan also suggested to using READ_ONCE()/WRITE_ONCE() for metrics[X]
> > > accesses. But I don't get why this line should happen before the cmpxchg(),
> > > Would you please help explain?
> >
> >
> > I think what Eric means is something like this:
> >
> >
> > ...
> > struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> >
> > if (!p)
> > return;
> >
> > p->metrics[metric - 1] = val;
> > refcount_set(&p->refcnt, 1);
> > if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> > kfree(p);
> > else
> > return;
> > }
> > }
> >
> > m = READ_ONCE(f6i->fib6_metrics);
> > WRITE_ONCE(m->metrics[metric - 1], val);
> >
> >
> > Since p is private data before being published via cmpxchg(), we can
> > safely initialize its metrics beforehand. This way we don't need to
> > worry about concurrent access to f6i->fib6_metrics->metrics[] during
> > initialization. Right?
>
> Yes. Think about RCU (Read Copy Update) rules.
>
> We allocate an object, and populate it, then make sure changes are
> committed, before publishing the new pointer.
>
> Othewise an other cpu could read a 0 metric, while we wanted something else.
Ah, got it. Thanks for the explanation.
Regards
Hangbin
On 3/26/26 12:22 PM, Hangbin Liu wrote:
> fib6_metric_set() may be called concurrently from softirq context without
> holding the FIB table lock. A typical path is:
>
> ndisc_router_discovery()
> spin_unlock_bh(&table->tb6_lock) <- lock released
> fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call
>
> When two CPUs process Router Advertisement packets for the same router
> simultaneously, they can both arrive at fib6_metric_set() with the same
> fib6_info pointer whose fib6_metrics still points to dst_default_metrics.
>
> if (f6i->fib6_metrics == &dst_default_metrics) { /* both CPUs: true */
> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> refcount_set(&p->refcnt, 1);
> f6i->fib6_metrics = p; /* CPU1 overwrites CPU0's p -> p0 leaked */
> }
>
> The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
> to it anywhere in memory, producing a kmemleak report:
>
> unreferenced object 0xff1100025aca1400 (size 96):
> comm "softirq", pid 0, jiffies 4299271239
> backtrace:
> kmalloc_trace+0x28a/0x380
> fib6_metric_set+0xcd/0x180
> ndisc_router_discovery+0x12dc/0x24b0
> icmpv6_rcv+0xc16/0x1360
>
> Fix this by replacing the plain pointer store with cmpxchg() and free
> the allocation safely when competition failed.
>
> Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
> Reported-by: Fei Liu <feliu@redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
> net/ipv6/ip6_fib.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> index dd26657b6a4a..64de761f40d5 100644
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
> if (!f6i)
> return;
>
> - if (f6i->fib6_metrics == &dst_default_metrics) {
> + if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
> + struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
>
> if (!p)
> return;
>
> refcount_set(&p->refcnt, 1);
> - f6i->fib6_metrics = p;
> + if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> + kfree(p);
> }
>
[...]
> f6i->fib6_metrics->metrics[metric - 1] = val;
Suggest using marked accessors to suppress KCSAN warnings:
struct dst_metrics *m = READ_ONCE(f6i->fib6_metrics);
WRITE_ONCE(m->metrics[metric - 1], val);
On Thu, Mar 26, 2026 at 02:23:15PM +0800, Jiayuan Chen wrote:
>
> On 3/26/26 12:22 PM, Hangbin Liu wrote:
> > fib6_metric_set() may be called concurrently from softirq context without
> > holding the FIB table lock. A typical path is:
> >
> > ndisc_router_discovery()
> > spin_unlock_bh(&table->tb6_lock) <- lock released
> > fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call
> >
> > When two CPUs process Router Advertisement packets for the same router
> > simultaneously, they can both arrive at fib6_metric_set() with the same
> > fib6_info pointer whose fib6_metrics still points to dst_default_metrics.
> >
> > if (f6i->fib6_metrics == &dst_default_metrics) { /* both CPUs: true */
> > struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> > refcount_set(&p->refcnt, 1);
> > f6i->fib6_metrics = p; /* CPU1 overwrites CPU0's p -> p0 leaked */
> > }
> >
> > The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
> > to it anywhere in memory, producing a kmemleak report:
> >
> > unreferenced object 0xff1100025aca1400 (size 96):
> > comm "softirq", pid 0, jiffies 4299271239
> > backtrace:
> > kmalloc_trace+0x28a/0x380
> > fib6_metric_set+0xcd/0x180
> > ndisc_router_discovery+0x12dc/0x24b0
> > icmpv6_rcv+0xc16/0x1360
> >
> > Fix this by replacing the plain pointer store with cmpxchg() and free
> > the allocation safely when competition failed.
> >
> > Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
> > Reported-by: Fei Liu <feliu@redhat.com>
> > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> > ---
> > net/ipv6/ip6_fib.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> > index dd26657b6a4a..64de761f40d5 100644
> > --- a/net/ipv6/ip6_fib.c
> > +++ b/net/ipv6/ip6_fib.c
> > @@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
> > if (!f6i)
> > return;
> > - if (f6i->fib6_metrics == &dst_default_metrics) {
> > + if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
> > + struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
> > struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> > if (!p)
> > return;
> > refcount_set(&p->refcnt, 1);
> > - f6i->fib6_metrics = p;
> > + if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> > + kfree(p);
> > }
>
>
> [...]
>
> > f6i->fib6_metrics->metrics[metric - 1] = val;
>
> Suggest using marked accessors to suppress KCSAN warnings:
>
> struct dst_metrics *m = READ_ONCE(f6i->fib6_metrics);
> WRITE_ONCE(m->metrics[metric - 1], val);
Thanks, I will update this in next version.
Hangbin
On Thu, Mar 26, 2026 at 06:44:43AM +0000, Hangbin Liu wrote:
> On Thu, Mar 26, 2026 at 02:23:15PM +0800, Jiayuan Chen wrote:
> >
> > On 3/26/26 12:22 PM, Hangbin Liu wrote:
> > > fib6_metric_set() may be called concurrently from softirq context without
> > > holding the FIB table lock. A typical path is:
> > >
> > > ndisc_router_discovery()
> > > spin_unlock_bh(&table->tb6_lock) <- lock released
> > > fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call
> > >
> > > When two CPUs process Router Advertisement packets for the same router
> > > simultaneously, they can both arrive at fib6_metric_set() with the same
> > > fib6_info pointer whose fib6_metrics still points to dst_default_metrics.
> > >
> > > if (f6i->fib6_metrics == &dst_default_metrics) { /* both CPUs: true */
> > > struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> > > refcount_set(&p->refcnt, 1);
> > > f6i->fib6_metrics = p; /* CPU1 overwrites CPU0's p -> p0 leaked */
> > > }
> > >
> > > The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
> > > to it anywhere in memory, producing a kmemleak report:
> > >
> > > unreferenced object 0xff1100025aca1400 (size 96):
> > > comm "softirq", pid 0, jiffies 4299271239
> > > backtrace:
> > > kmalloc_trace+0x28a/0x380
> > > fib6_metric_set+0xcd/0x180
> > > ndisc_router_discovery+0x12dc/0x24b0
> > > icmpv6_rcv+0xc16/0x1360
> > >
> > > Fix this by replacing the plain pointer store with cmpxchg() and free
> > > the allocation safely when competition failed.
> > >
> > > Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
> > > Reported-by: Fei Liu <feliu@redhat.com>
> > > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> > > ---
> > > net/ipv6/ip6_fib.c | 6 ++++--
> > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> > > index dd26657b6a4a..64de761f40d5 100644
> > > --- a/net/ipv6/ip6_fib.c
> > > +++ b/net/ipv6/ip6_fib.c
> > > @@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
> > > if (!f6i)
> > > return;
> > > - if (f6i->fib6_metrics == &dst_default_metrics) {
> > > + if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
> > > + struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
> > > struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
> > > if (!p)
> > > return;
> > > refcount_set(&p->refcnt, 1);
> > > - f6i->fib6_metrics = p;
> > > + if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
> > > + kfree(p);
> > > }
> >
> >
> > [...]
> >
> > > f6i->fib6_metrics->metrics[metric - 1] = val;
> >
> > Suggest using marked accessors to suppress KCSAN warnings:
> >
> > struct dst_metrics *m = READ_ONCE(f6i->fib6_metrics);
> > WRITE_ONCE(m->metrics[metric - 1], val);
>
> Thanks, I will update this in next version.
BTW, do we really need to WRITE_ONCE here? What if the `val` are different
on 2 CPUs? This would hide the problem, right?
Thanks
Hangbin
On 3/26/26 3:13 PM, Hangbin Liu wrote:
> On Thu, Mar 26, 2026 at 06:44:43AM +0000, Hangbin Liu wrote:
>> On Thu, Mar 26, 2026 at 02:23:15PM +0800, Jiayuan Chen wrote:
>>> On 3/26/26 12:22 PM, Hangbin Liu wrote:
>>>> fib6_metric_set() may be called concurrently from softirq context without
>>>> holding the FIB table lock. A typical path is:
>>>>
>>>> ndisc_router_discovery()
>>>> spin_unlock_bh(&table->tb6_lock) <- lock released
>>>> fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call
>>>>
>>>> When two CPUs process Router Advertisement packets for the same router
>>>> simultaneously, they can both arrive at fib6_metric_set() with the same
>>>> fib6_info pointer whose fib6_metrics still points to dst_default_metrics.
>>>>
>>>> if (f6i->fib6_metrics == &dst_default_metrics) { /* both CPUs: true */
>>>> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
>>>> refcount_set(&p->refcnt, 1);
>>>> f6i->fib6_metrics = p; /* CPU1 overwrites CPU0's p -> p0 leaked */
>>>> }
>>>>
>>>> The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
>>>> to it anywhere in memory, producing a kmemleak report:
>>>>
>>>> unreferenced object 0xff1100025aca1400 (size 96):
>>>> comm "softirq", pid 0, jiffies 4299271239
>>>> backtrace:
>>>> kmalloc_trace+0x28a/0x380
>>>> fib6_metric_set+0xcd/0x180
>>>> ndisc_router_discovery+0x12dc/0x24b0
>>>> icmpv6_rcv+0xc16/0x1360
>>>>
>>>> Fix this by replacing the plain pointer store with cmpxchg() and free
>>>> the allocation safely when competition failed.
>>>>
>>>> Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
>>>> Reported-by: Fei Liu <feliu@redhat.com>
>>>> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
>>>> ---
>>>> net/ipv6/ip6_fib.c | 6 ++++--
>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
>>>> index dd26657b6a4a..64de761f40d5 100644
>>>> --- a/net/ipv6/ip6_fib.c
>>>> +++ b/net/ipv6/ip6_fib.c
>>>> @@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
>>>> if (!f6i)
>>>> return;
>>>> - if (f6i->fib6_metrics == &dst_default_metrics) {
>>>> + if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
>>>> + struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
>>>> struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
>>>> if (!p)
>>>> return;
>>>> refcount_set(&p->refcnt, 1);
>>>> - f6i->fib6_metrics = p;
>>>> + if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
>>>> + kfree(p);
>>>> }
>>>
>>> [...]
>>>
>>>> f6i->fib6_metrics->metrics[metric - 1] = val;
>>> Suggest using marked accessors to suppress KCSAN warnings:
>>>
>>> struct dst_metrics *m = READ_ONCE(f6i->fib6_metrics);
>>> WRITE_ONCE(m->metrics[metric - 1], val);
>> Thanks, I will update this in next version.
> BTW, do we really need to WRITE_ONCE here? What if the `val` are different
> on 2 CPUs? This would hide the problem, right?
>
> Thanks
> Hangbin
If concurrent writes to the same memory by two CPUs are considered a bug,
the solution is prevention, not detection by KCSAN. Furthermore, WRITE_ONCE
also helps suppress warnings from read-write races. For example, commit
6e84fc395e90 demonstrates this.
© 2016 - 2026 Red Hat, Inc.