drivers/net/wireguard/queueing.h | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-)
From: Yury Norov [NVIDIA] <yury.norov@gmail.com>
wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the
function significantly simpler. While there, fix opencoded cpu_online()
too.
Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov@gmail.com/
v2:
- fix 'cpu' undeclared;
- change subject (Jason);
- keep the original function structure (Jason);
drivers/net/wireguard/queueing.h | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
index 7eb76724b3ed..56314f98b6ba 100644
--- a/drivers/net/wireguard/queueing.h
+++ b/drivers/net/wireguard/queueing.h
@@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating)
static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
{
- unsigned int cpu = *stored_cpu, cpu_index, i;
+ unsigned int cpu = *stored_cpu;
+
+ if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu)))
+ cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask);
- if (unlikely(cpu >= nr_cpu_ids ||
- !cpumask_test_cpu(cpu, cpu_online_mask))) {
- cpu_index = id % cpumask_weight(cpu_online_mask);
- cpu = cpumask_first(cpu_online_mask);
- for (i = 0; i < cpu_index; ++i)
- cpu = cpumask_next(cpu, cpu_online_mask);
- *stored_cpu = cpu;
- }
return cpu;
}
--
2.43.0
On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote: > From: Yury Norov [NVIDIA] <yury.norov@gmail.com> > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the > function significantly simpler. While there, fix opencoded cpu_online() > too. > > Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> > --- > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov@gmail.com/ > v2: > - fix 'cpu' undeclared; > - change subject (Jason); > - keep the original function structure (Jason); > > drivers/net/wireguard/queueing.h | 13 ++++--------- > 1 file changed, 4 insertions(+), 9 deletions(-) > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > index 7eb76724b3ed..56314f98b6ba 100644 > --- a/drivers/net/wireguard/queueing.h > +++ b/drivers/net/wireguard/queueing.h > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > { > - unsigned int cpu = *stored_cpu, cpu_index, i; > + unsigned int cpu = *stored_cpu; > + > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); I was about to apply this but then it occurred to me: what happens if cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated? cpumask_nth() will then return nr_cpu_ids? Jason
On Mon, Jun 30, 2025 at 07:24:33PM +0200, Jason A. Donenfeld wrote: > On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote: > > From: Yury Norov [NVIDIA] <yury.norov@gmail.com> > > > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the > > function significantly simpler. While there, fix opencoded cpu_online() > > too. > > > > Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> > > --- > > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov@gmail.com/ > > v2: > > - fix 'cpu' undeclared; > > - change subject (Jason); > > - keep the original function structure (Jason); > > > > drivers/net/wireguard/queueing.h | 13 ++++--------- > > 1 file changed, 4 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > > index 7eb76724b3ed..56314f98b6ba 100644 > > --- a/drivers/net/wireguard/queueing.h > > +++ b/drivers/net/wireguard/queueing.h > > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > > { > > - unsigned int cpu = *stored_cpu, cpu_index, i; > > + unsigned int cpu = *stored_cpu; > > + > > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); > > I was about to apply this but then it occurred to me: what happens if > cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated? > cpumask_nth() will then return nr_cpu_ids? It will return >= nd_cpu_ids. The original version based a for-loop does the same, so I decided that the caller is safe against it. If not, I can send a v3. But, what should we do - retry, or return a local cpu? Or something else? Thanks, Yury
On Mon, Jun 30, 2025 at 01:33:37PM -0400, Yury Norov wrote: > On Mon, Jun 30, 2025 at 07:24:33PM +0200, Jason A. Donenfeld wrote: > > On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote: > > > From: Yury Norov [NVIDIA] <yury.norov@gmail.com> > > > > > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the > > > function significantly simpler. While there, fix opencoded cpu_online() > > > too. > > > > > > Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> > > > --- > > > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov@gmail.com/ > > > v2: > > > - fix 'cpu' undeclared; > > > - change subject (Jason); > > > - keep the original function structure (Jason); > > > > > > drivers/net/wireguard/queueing.h | 13 ++++--------- > > > 1 file changed, 4 insertions(+), 9 deletions(-) > > > > > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > > > index 7eb76724b3ed..56314f98b6ba 100644 > > > --- a/drivers/net/wireguard/queueing.h > > > +++ b/drivers/net/wireguard/queueing.h > > > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > > > > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > > > { > > > - unsigned int cpu = *stored_cpu, cpu_index, i; > > > + unsigned int cpu = *stored_cpu; > > > + > > > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > > > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); > > > > I was about to apply this but then it occurred to me: what happens if > > cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated? > > cpumask_nth() will then return nr_cpu_ids? > > It will return >= nd_cpu_ids. The original version based a for-loop > does the same, so I decided that the caller is safe against it. Good point. I just checked... This goes into queue_work_on() which eventually hits: /* pwq which will be used unless @work is executing elsewhere */ if (req_cpu == WORK_CPU_UNBOUND) { And it turns out WORK_CPU_UNBOUND is the same as nr_cpu_ids. So I guess that's a fine failure mode. I'll queue this patch up. Jason
On Mon, Jun 30, 2025 at 07:38:02PM +0200, Jason A. Donenfeld wrote: > On Mon, Jun 30, 2025 at 01:33:37PM -0400, Yury Norov wrote: > > On Mon, Jun 30, 2025 at 07:24:33PM +0200, Jason A. Donenfeld wrote: > > > On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote: > > > > From: Yury Norov [NVIDIA] <yury.norov@gmail.com> > > > > > > > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the > > > > function significantly simpler. While there, fix opencoded cpu_online() > > > > too. > > > > > > > > Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> > > > > --- > > > > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov@gmail.com/ > > > > v2: > > > > - fix 'cpu' undeclared; > > > > - change subject (Jason); > > > > - keep the original function structure (Jason); > > > > > > > > drivers/net/wireguard/queueing.h | 13 ++++--------- > > > > 1 file changed, 4 insertions(+), 9 deletions(-) > > > > > > > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > > > > index 7eb76724b3ed..56314f98b6ba 100644 > > > > --- a/drivers/net/wireguard/queueing.h > > > > +++ b/drivers/net/wireguard/queueing.h > > > > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > > > > > > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > > > > { > > > > - unsigned int cpu = *stored_cpu, cpu_index, i; > > > > + unsigned int cpu = *stored_cpu; > > > > + > > > > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > > > > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); > > > > > > I was about to apply this but then it occurred to me: what happens if > > > cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated? > > > cpumask_nth() will then return nr_cpu_ids? > > > > It will return >= nd_cpu_ids. The original version based a for-loop > > does the same, so I decided that the caller is safe against it. > > Good point. I just checked... This goes into queue_work_on() which > eventually hits: > > /* pwq which will be used unless @work is executing elsewhere */ > if (req_cpu == WORK_CPU_UNBOUND) { > > And it turns out WORK_CPU_UNBOUND is the same as nr_cpu_ids. So I guess > that's a fine failure mode. Actually, cpumask_nth_cpu may return >= nr_cpu_ids because of small_cpumask_nbits optimization. So it's safer to relax the condition. Can you consider applying the following patch for that? Thanks, Yury From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 From: Yury Norov (NVIDIA) <yury.norov@gmail.com> Date: Mon, 30 Jun 2025 13:47:49 -0400 Subject: [PATCH] workqueue: relax condition in __queue_work() Some cpumask search functions may return a number greater than nr_cpu_ids when nothing is found. Adjust __queue_work() to it. Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com> --- kernel/workqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 9f9148075828..abacfe157fe6 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, rcu_read_lock(); retry: /* pwq which will be used unless @work is executing elsewhere */ - if (req_cpu == WORK_CPU_UNBOUND) { + if (req_cpu >= WORK_CPU_UNBOUND) { if (wq->flags & WQ_UNBOUND) cpu = wq_select_unbound_cpu(raw_smp_processor_id()); else -- 2.43.0
Hi Yury, > > > > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > > > > > index 7eb76724b3ed..56314f98b6ba 100644 > > > > > --- a/drivers/net/wireguard/queueing.h > > > > > +++ b/drivers/net/wireguard/queueing.h > > > > > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > > > > > > > > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > > > > > { > > > > > - unsigned int cpu = *stored_cpu, cpu_index, i; > > > > > + unsigned int cpu = *stored_cpu; > > > > > + > > > > > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > > > > > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); > > > > > > > > I was about to apply this but then it occurred to me: what happens if > > > > cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated? > > > > cpumask_nth() will then return nr_cpu_ids? > > > > > > It will return >= nd_cpu_ids. The original version based a for-loop > > > does the same, so I decided that the caller is safe against it. > > > > Good point. I just checked... This goes into queue_work_on() which > > eventually hits: > > > > /* pwq which will be used unless @work is executing elsewhere */ > > if (req_cpu == WORK_CPU_UNBOUND) { > > > > And it turns out WORK_CPU_UNBOUND is the same as nr_cpu_ids. So I guess > > that's a fine failure mode. > > Actually, cpumask_nth_cpu may return >= nr_cpu_ids because of > small_cpumask_nbits optimization. So it's safer to relax the > condition. > > Can you consider applying the following patch for that? > > Thanks, > Yury > > > From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 > From: Yury Norov (NVIDIA) <yury.norov@gmail.com> > Date: Mon, 30 Jun 2025 13:47:49 -0400 > Subject: [PATCH] workqueue: relax condition in __queue_work() > > Some cpumask search functions may return a number greater than > nr_cpu_ids when nothing is found. Adjust __queue_work() to it. > > Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com> > --- > kernel/workqueue.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > index 9f9148075828..abacfe157fe6 100644 > --- a/kernel/workqueue.c > +++ b/kernel/workqueue.c > @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, > rcu_read_lock(); > retry: > /* pwq which will be used unless @work is executing elsewhere */ > - if (req_cpu == WORK_CPU_UNBOUND) { > + if (req_cpu >= WORK_CPU_UNBOUND) { > if (wq->flags & WQ_UNBOUND) > cpu = wq_select_unbound_cpu(raw_smp_processor_id()); > else > Seems reasonable to me... Maybe submit this to Tejun and CC me? Jason
On Mon, Jun 30, 2025 at 07:55:49PM +0200, Jason A. Donenfeld wrote: > Hi Yury, > > > > > > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > > > > > > index 7eb76724b3ed..56314f98b6ba 100644 > > > > > > --- a/drivers/net/wireguard/queueing.h > > > > > > +++ b/drivers/net/wireguard/queueing.h > > > > > > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > > > > > > > > > > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > > > > > > { > > > > > > - unsigned int cpu = *stored_cpu, cpu_index, i; > > > > > > + unsigned int cpu = *stored_cpu; > > > > > > + > > > > > > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > > > > > > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); > > > > > > > > > > I was about to apply this but then it occurred to me: what happens if > > > > > cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated? > > > > > cpumask_nth() will then return nr_cpu_ids? > > > > > > > > It will return >= nd_cpu_ids. The original version based a for-loop > > > > does the same, so I decided that the caller is safe against it. > > > > > > Good point. I just checked... This goes into queue_work_on() which > > > eventually hits: > > > > > > /* pwq which will be used unless @work is executing elsewhere */ > > > if (req_cpu == WORK_CPU_UNBOUND) { > > > > > > And it turns out WORK_CPU_UNBOUND is the same as nr_cpu_ids. So I guess > > > that's a fine failure mode. > > > > Actually, cpumask_nth_cpu may return >= nr_cpu_ids because of > > small_cpumask_nbits optimization. So it's safer to relax the > > condition. > > > > Can you consider applying the following patch for that? > > > > Thanks, > > Yury > > > > > > From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 > > From: Yury Norov (NVIDIA) <yury.norov@gmail.com> > > Date: Mon, 30 Jun 2025 13:47:49 -0400 > > Subject: [PATCH] workqueue: relax condition in __queue_work() > > > > Some cpumask search functions may return a number greater than > > nr_cpu_ids when nothing is found. Adjust __queue_work() to it. > > > > Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com> > > --- > > kernel/workqueue.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > > index 9f9148075828..abacfe157fe6 100644 > > --- a/kernel/workqueue.c > > +++ b/kernel/workqueue.c > > @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, > > rcu_read_lock(); > > retry: > > /* pwq which will be used unless @work is executing elsewhere */ > > - if (req_cpu == WORK_CPU_UNBOUND) { > > + if (req_cpu >= WORK_CPU_UNBOUND) { > > if (wq->flags & WQ_UNBOUND) > > cpu = wq_select_unbound_cpu(raw_smp_processor_id()); > > else > > > > Seems reasonable to me... Maybe submit this to Tejun and CC me? Sure, no problem.
> > > From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 > > > From: Yury Norov (NVIDIA) <yury.norov@gmail.com> > > > Date: Mon, 30 Jun 2025 13:47:49 -0400 > > > Subject: [PATCH] workqueue: relax condition in __queue_work() > > > > > > Some cpumask search functions may return a number greater than > > > nr_cpu_ids when nothing is found. Adjust __queue_work() to it. > > > > > > Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com> > > > --- > > > kernel/workqueue.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > > > index 9f9148075828..abacfe157fe6 100644 > > > --- a/kernel/workqueue.c > > > +++ b/kernel/workqueue.c > > > @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, > > > rcu_read_lock(); > > > retry: > > > /* pwq which will be used unless @work is executing elsewhere */ > > > - if (req_cpu == WORK_CPU_UNBOUND) { > > > + if (req_cpu >= WORK_CPU_UNBOUND) { > > > if (wq->flags & WQ_UNBOUND) > > > cpu = wq_select_unbound_cpu(raw_smp_processor_id()); > > > else > > > > > > > Seems reasonable to me... Maybe submit this to Tejun and CC me? > > Sure, no problem. Hmm... So, actually WORK_CPU_UNBOUND is NR_CPUS, which is not the same as nr_cpu_ids. For example, on my Ubuntu machine, the CONFIG_NR_CPUS is 8192, and nr_cpu_ids is 8. So, for the wg_cpumask_next_online() to work properly, we need to return the WORK_CPU_UNBOUND in case of nothing is found. I think I need to send a v3... Thanks, Yury
On Mon, Jun 30, 2025 at 8:15 PM Yury Norov <yury.norov@gmail.com> wrote: > > > > > From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 > > > > From: Yury Norov (NVIDIA) <yury.norov@gmail.com> > > > > Date: Mon, 30 Jun 2025 13:47:49 -0400 > > > > Subject: [PATCH] workqueue: relax condition in __queue_work() > > > > > > > > Some cpumask search functions may return a number greater than > > > > nr_cpu_ids when nothing is found. Adjust __queue_work() to it. > > > > > > > > Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com> > > > > --- > > > > kernel/workqueue.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > > > > index 9f9148075828..abacfe157fe6 100644 > > > > --- a/kernel/workqueue.c > > > > +++ b/kernel/workqueue.c > > > > @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, > > > > rcu_read_lock(); > > > > retry: > > > > /* pwq which will be used unless @work is executing elsewhere */ > > > > - if (req_cpu == WORK_CPU_UNBOUND) { > > > > + if (req_cpu >= WORK_CPU_UNBOUND) { > > > > if (wq->flags & WQ_UNBOUND) > > > > cpu = wq_select_unbound_cpu(raw_smp_processor_id()); > > > > else > > > > > > > > > > Seems reasonable to me... Maybe submit this to Tejun and CC me? > > > > Sure, no problem. > > Hmm... So, actually WORK_CPU_UNBOUND is NR_CPUS, which is not the same > as nr_cpu_ids. For example, on my Ubuntu machine, the CONFIG_NR_CPUS > is 8192, and nr_cpu_ids is 8. > > So, for the wg_cpumask_next_online() to work properly, we need to > return the WORK_CPU_UNBOUND in case of nothing is found. Or just try again? Could just make your if into a while.
On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote: > From: Yury Norov [NVIDIA] <yury.norov@gmail.com> > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the > function significantly simpler. While there, fix opencoded cpu_online() > too. > > Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> > --- > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov@gmail.com/ > v2: > - fix 'cpu' undeclared; > - change subject (Jason); > - keep the original function structure (Jason); Reviewed-by: Simon Horman <horms@kernel.org>
© 2016 - 2025 Red Hat, Inc.