From: Guo Ren <guoren@linux.alibaba.com>
Don't pass nr_bits as arg1: cpu_max_bits_warn() now triggers a warning
since commit 854701ba4c39 ("net: fix cpu_max_bits_warn() usage in
netif_attrmask_next{,_and}").
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1 at include/linux/cpumask.h:110 __netif_set_xps_queue+0x14e/0x770
Modules linked in:
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc4-00018-g854701ba4c39 #324
Hardware name: riscv-virtio,qemu (DT)
epc : __netif_set_xps_queue+0x14e/0x770
ra : __netif_set_xps_queue+0x552/0x770
epc : ffffffff806fe448 ra : ffffffff806fe84c sp : ff600000023279d0
gp : ffffffff815fff88 tp : ff600000023a0000 t0 : ff6000000308ab40
t1 : 0000000000000003 t2 : 0000000000000000 s0 : ff60000002327a90
s1 : 0000000000000000 a0 : ff6000000308ab00 a1 : ff6000000308ab00
a2 : ff6000000308a8e8 a3 : 0000000000000004 a4 : 0000000000000000
a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000
s2 : 0000000000000000 s3 : 0000000000000000 s4 : ff60000002327aa0
s5 : ffffffff816031c8 s6 : 0000000000000000 s7 : 0000000000000001
s8 : 0000000000000000 s9 : 0000000000000004 s10: ff6000000308a8c0
s11: 0000000000000004 t3 : 0000000000000000 t4 : 0000000000000014
t5 : 0000000000000000 t6 : 0000000000000000
status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
[<ffffffff805e5824>] virtnet_set_affinity+0x14a/0x1c0
[<ffffffff805e7b04>] virtnet_probe+0x7fc/0xee2
[<ffffffff8050e120>] virtio_dev_probe+0x164/0x2de
[<ffffffff8055b69e>] really_probe+0x82/0x224
[<ffffffff8055b89a>] __driver_probe_device+0x5a/0xaa
[<ffffffff8055b916>] driver_probe_device+0x2c/0xb8
[<ffffffff8055bf34>] __driver_attach+0x76/0x108
[<ffffffff805597c0>] bus_for_each_dev+0x4a/0x8e
[<ffffffff8055b072>] driver_attach+0x1a/0x28
[<ffffffff8055ab8c>] bus_add_driver+0x13c/0x1a6
[<ffffffff8055c722>] driver_register+0x4a/0xfc
[<ffffffff8050dc34>] register_virtio_driver+0x1c/0x2c
[<ffffffff80a2bae4>] virtio_net_driver_init+0x7a/0xb0
[<ffffffff80002840>] do_one_initcall+0x66/0x2e4
[<ffffffff80a01212>] kernel_init_freeable+0x28a/0x304
[<ffffffff808b21e2>] kernel_init+0x1e/0x110
[<ffffffff80003c46>] ret_from_exception+0x0/0x10
---[ end trace 0000000000000000 ]---
Fixes: 80d19669ecd3 ("net: Refactor XPS for CPUs and Rx queues")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
net/core/dev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index fa53830d0683..9ec8b10ae329 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2589,8 +2589,8 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
 		copy = true;
 
 	/* allocate memory for queue storage */
-	for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids),
-	     j < nr_ids;) {
+	for (j = -1; j < nr_ids;
+	     j = netif_attrmask_next_and(j, online_mask, mask, nr_ids)) {
 		if (!new_dev_maps) {
 			new_dev_maps = kzalloc(maps_sz, GFP_KERNEL);
 			if (!new_dev_maps) {
--
2.36.1
On Thu, Oct 13, 2022 at 11:04:58PM -0400, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> Don't pass nr_bits as arg1: cpu_max_bits_warn() now triggers a warning
> since commit 854701ba4c39 ("net: fix cpu_max_bits_warn() usage in
> netif_attrmask_next{,_and}").
>
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 1 at include/linux/cpumask.h:110 __netif_set_xps_queue+0x14e/0x770
> Modules linked in:
Submitting Patches documentation suggests to cut this to only what makes sense
for the report.
--
With Best Regards,
Andy Shevchenko
On Fri, Oct 14, 2022 at 6:01 PM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Thu, Oct 13, 2022 at 11:04:58PM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > Don't pass nr_bits as arg1: cpu_max_bits_warn() now triggers a warning
> > since commit 854701ba4c39 ("net: fix cpu_max_bits_warn() usage in
> > netif_attrmask_next{,_and}").
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 2 PID: 1 at include/linux/cpumask.h:110 __netif_set_xps_queue+0x14e/0x770
> > Modules linked in:
>
> Submitting Patches documentation suggests to cut this to only what makes sense
> for the report.
Right, thx for mentioning.
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
--
Best Regards
Guo Ren
On Thu, 13 Oct 2022 23:04:58 -0400 guoren@kernel.org wrote:
> - for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids),
> - j < nr_ids;) {
> + for (j = -1; j < nr_ids;
> + j = netif_attrmask_next_and(j, online_mask, mask, nr_ids)) {
This does not look equivalent, have you tested it?
nr_ids is unsigned, doesn't it mean we'll never enter the loop?
Can we instead revert 854701ba4c and take the larger rework Yury
has posted a week ago into net-next?
On Fri, Oct 14, 2022 at 11:35 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 13 Oct 2022 23:04:58 -0400 guoren@kernel.org wrote:
> > - for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids),
> > - j < nr_ids;) {
> > + for (j = -1; j < nr_ids;
> > + j = netif_attrmask_next_and(j, online_mask, mask, nr_ids)) {
>
> This does not look equivalent, have you tested it?
>
> nr_ids is unsigned, doesn't it mean we'll never enter the loop?
Yes, you are right. Any unsigned int would break the result.
(gdb) p (int)-1 < (int)2
$1 = 1
(gdb) p (int)-1 < (unsigned int)2
$2 = 0
(gdb) p (unsigned int)-1 < (int)2
$4 = 0
So it should be:
- for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids),
- j < nr_ids;) {
+ for (j = -1; j < (int)nr_ids;
+ j = netif_attrmask_next_and(j, online_mask, mask, nr_ids)) {
Right? Of course, nr_ids couldn't be 0xffffffff (-1).
>
> Can we instead revert 854701ba4c and take the larger rework Yury
> has posted a week ago into net-next?
--
Best Regards
Guo Ren
On Fri, 14 Oct 2022 14:38:56 +0800 Guo Ren wrote:
> > This does not look equivalent, have you tested it?
> >
> > nr_ids is unsigned, doesn't it mean we'll never enter the loop?
>
> Yes, you are right. Any unsigned int would break the result.
> (gdb) p (int)-1 < (int)2
> $1 = 1
> (gdb) p (int)-1 < (unsigned int)2
> $2 = 0
> (gdb) p (unsigned int)-1 < (int)2
> $4 = 0
>
> So it should be:
> - for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids),
> - j < nr_ids;) {
> + for (j = -1; j < (int)nr_ids;
> + j = netif_attrmask_next_and(j, online_mask, mask, nr_ids)) {
>
> Right? Of course, nr_ids couldn't be 0xffffffff (-1).
No. You can't enter the loop with -1 as the iterator either.
Let's move on.
On Fri, Oct 14, 2022 at 11:52 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 14 Oct 2022 14:38:56 +0800 Guo Ren wrote:
> > > This does not look equivalent, have you tested it?
> > >
> > > nr_ids is unsigned, doesn't it mean we'll never enter the loop?
> >
> > Yes, you are right. Any unsigned int would break the result.
> > (gdb) p (int)-1 < (int)2
> > $1 = 1
> > (gdb) p (int)-1 < (unsigned int)2
> > $2 = 0
> > (gdb) p (unsigned int)-1 < (int)2
> > $4 = 0
> >
> > So it should be:
> > - for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids),
> > - j < nr_ids;) {
> > + for (j = -1; j < (int)nr_ids;
> > + j = netif_attrmask_next_and(j, online_mask, mask, nr_ids)) {
> >
> > Right? Of course, nr_ids couldn't be 0xffffffff (-1).
>
> No. You can't enter the loop with -1 as the iterator either.
> Let's move on.
Oops, how about the below:
	for (j = netif_attrmask_next_and(-1, online_mask, mask, nr_ids);
	     j < (int)nr_ids;
	     j = netif_attrmask_next_and(j, online_mask, mask, nr_ids)) {
--
Best Regards
Guo Ren
On Thu, 13 Oct 2022 20:35:44 -0700 Jakub Kicinski wrote:
> Can we instead revert 854701ba4c and take the larger rework Yury
> has posted a week ago into net-next?

Oh, it was reposted today:

https://lore.kernel.org/all/20221013234349.1165689-2-yury.norov@gmail.com/

But we need a revert of 854701ba4c as well to cover the issue back up
for 6.1, AFAIU.
On Thu, Oct 13, 2022 at 08:39:11PM -0700, Jakub Kicinski wrote:
> On Thu, 13 Oct 2022 20:35:44 -0700 Jakub Kicinski wrote:
> > Can we instead revert 854701ba4c and take the larger rework Yury
> > has posted a week ago into net-next?
>
> Oh, it was reposted today:
>
> https://lore.kernel.org/all/20221013234349.1165689-2-yury.norov@gmail.com/
>
> But we need a revert of 854701ba4c as well to cover the issue back up
> for 6.1, AFAIU.

The patch 854701ba4c is technically correct. I fixed most of warnings in
advance, but nobody can foresee everything, right? I expected some noise,
and now we have just a few things to fix. This is what for -rc releases
exist, didn't they?

I suggest to keep the patch, because this is the only way to make
cpumask_check()-related issues visible to people. If things will go as
they go now, I expect that -rc3 will be clean from cpumask_check()
warnings.

Thanks,
Yury
On Thu, 13 Oct 2022 21:42:41 -0700 Yury Norov wrote:
> > Oh, it was reposted today:
> >
> > https://lore.kernel.org/all/20221013234349.1165689-2-yury.norov@gmail.com/
> >
> > But we need a revert of 854701ba4c as well to cover the issue back up
> > for 6.1, AFAIU.
>
> The patch 854701ba4c is technically correct. I fixed most of warnings in
> advance, but nobody can foresee everything, right? I expected some noise,
> and now we have just a few things to fix.

I got 6 warnings booting my machine after pulling back from Linus
(which included your patches in net for the first time).
And that's not including the XPS and the virtio warning.

> This is what for -rc releases exist, didn't they?
>
> I suggest to keep the patch, because this is the only way to make
> cpumask_check()-related issues visible to people. If things will go as
> they go now, I expect that -rc3 will be clean from cpumask_check()
> warnings.

This sounds too close to saying that "it's okay for -rc1 to be broken".
Why were your changes not in linux-next for a month before the merge
window? :(

We will not be merging a refactoring series into net to silence an
arguably over-eager warning. We need a minimal fix, Guo Ren's patches
seem to miss the mark so I reckon the best use of everyone's time is
to just drop the exposing patch and retry in -next 🤷
On Sat, Oct 15, 2022 at 12:03 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 13 Oct 2022 21:42:41 -0700 Yury Norov wrote:
> > > Oh, it was reposted today:
> > >
> > > https://lore.kernel.org/all/20221013234349.1165689-2-yury.norov@gmail.com/
> > >
> > > But we need a revert of 854701ba4c as well to cover the issue back up
> > > for 6.1, AFAIU.
> >
> > The patch 854701ba4c is technically correct. I fixed most of warnings in
> > advance, but nobody can foresee everything, right? I expected some noise,
> > and now we have just a few things to fix.
>
> I got 6 warnings booting my machine after pulling back from Linus
> (which included your patches in net for the first time).
> And that's not including the XPS and the virtio warning.

Oh, that's a wider effect than we thought.

>
> > This is what for -rc releases exist, didn't they?
> >
> > I suggest to keep the patch, because this is the only way to make
> > cpumask_check()-related issues visible to people. If things will go as
> > they go now, I expect that -rc3 will be clean from cpumask_check()
> > warnings.
>
> This sounds too close to saying that "it's okay for -rc1 to be broken".
> Why were your changes not in linux-next for a month before the merge
> window? :(
>
> We will not be merging a refactoring series into net to silence an
> arguably over-eager warning. We need a minimal fix, Guo Ren's patches
> seem to miss the mark so I reckon the best use of everyone's time is
> to just drop the exposing patch and retry in -next 🤷

--
Best Regards
Guo Ren
On Fri, Oct 14, 2022 at 9:03 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 13 Oct 2022 21:42:41 -0700 Yury Norov wrote:
> > > Oh, it was reposted today:
> > >
> > > https://lore.kernel.org/all/20221013234349.1165689-2-yury.norov@gmail.com/
> > >
> > > But we need a revert of 854701ba4c as well to cover the issue back up
> > > for 6.1, AFAIU.
> >
> > The patch 854701ba4c is technically correct. I fixed most of warnings in
> > advance, but nobody can foresee everything, right? I expected some noise,
> > and now we have just a few things to fix.
>
> I got 6 warnings booting my machine after pulling back from Linus
> (which included your patches in net for the first time).
> And that's not including the XPS and the virtio warning.
>
> > This is what for -rc releases exist, didn't they?
> >
> > I suggest to keep the patch, because this is the only way to make
> > cpumask_check()-related issues visible to people. If things will go as
> > they go now, I expect that -rc3 will be clean from cpumask_check()
> > warnings.
>
> This sounds too close to saying that "it's okay for -rc1 to be broken".
> Why were your changes not in linux-next for a month before the merge
> window? :(

They spent about a month in -next. Nobody cared.

> We will not be merging a refactoring series into net to silence an
> arguably over-eager warning. We need a minimal fix, Guo Ren's patches
> seem to miss the mark so I reckon the best use of everyone's time is
> to just drop the exposing patch and retry in -next 🤷

If you prefer treating symptoms rather than the disease - I have nothing
to add.
On Fri, 14 Oct 2022 09:16:01 -0700 Yury Norov wrote:
> > We will not be merging a refactoring series into net to silence an
> > arguably over-eager warning. We need a minimal fix, Guo Ren's patches
> > seem to miss the mark so I reckon the best use of everyone's time is
> > to just drop the exposing patch and retry in -next 🤷
>
> If you prefer treating symptoms rather than the disease - I have nothing
> to add.

I don't, but we may consider different things to be "the disease".
Please do not insinuate that I don't care about fixing bugs.

What I can grok from the history and your commit messages is that
you want to catch people who pass what you consider invalid inputs
to the helpers, but nothing will crash/OOB access here, because the
helper double checks that the input is < nr_bits. So it's a nice cleanup
and refactoring, sure, but not an urgent fix that needs to go to Linus ASAP.

If that's not what you're fixing please explain, I believe I already
asked you to clarify before. And the commit messages aren't exactly
informative either.
On Fri, Oct 14, 2022 at 12:42 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> On Thu, Oct 13, 2022 at 08:39:11PM -0700, Jakub Kicinski wrote:
> > On Thu, 13 Oct 2022 20:35:44 -0700 Jakub Kicinski wrote:
> > > Can we instead revert 854701ba4c and take the larger rework Yury
> > > has posted a week ago into net-next?
> >
> > Oh, it was reposted today:
> >
> > https://lore.kernel.org/all/20221013234349.1165689-2-yury.norov@gmail.com/
> >
> > But we need a revert of 854701ba4c as well to cover the issue back up
> > for 6.1, AFAIU.
>
> The patch 854701ba4c is technically correct. I fixed most of warnings in
> advance, but nobody can foresee everything, right? I expected some noise,
> and now we have just a few things to fix. This is what for -rc releases
> exist, didn't they?

Your job is great, I just want to help with some fixes. Fixing them in -rc
would be a good point.

>
> I suggest to keep the patch, because this is the only way to make
> cpumask_check()-related issues visible to people. If things will go as
> they go now, I expect that -rc3 will be clean from cpumask_check()
> warnings.
>
> Thanks,
> Yury

--
Best Regards
Guo Ren