fs/netfs/read_pgpriv2.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
At the beginning of the function, folio queues with marks3==0 are
skipped, but after that, the `marks3` field is ignored. If one such
queue is found, `slot` is set to 64 (because `__ffs(0)==64`), leading
to a buffer overflow in the folioq_folio() call. The resulting crash
may look like this:
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP PTI
CPU: 11 UID: 0 PID: 2909 Comm: kworker/u262:1 Not tainted 6.13.1-cm4all2-vm #415
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Workqueue: events_unbound netfs_read_termination_worker
RIP: 0010:netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0
Code: 48 85 c0 48 89 44 24 08 0f 84 24 01 00 00 48 8b 80 40 01 00 00 48 8b 7c 24 08 f3 48 0f bc c0 89 44 24 18 89 c0 48 8b 74 c7 08 <48> 8b 06 48 c7 04 24 00 10 00 00 a8 40 74 10 0f b6 4e 40 b8 00 10
RSP: 0018:ffffbbc440effe18 EFLAGS: 00010203
RAX: 0000000000000040 RBX: ffff96f8fc034000 RCX: 0000000000000000
RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffff96f8fc036400
RBP: 0000000000001000 R08: ffff96f9132bb400 R09: 0000000000001000
R10: ffff96f8c1263c80 R11: 0000000000000003 R12: 0000000000001000
R13: ffff96f8fb75ade8 R14: fffffaaf5ca90000 R15: ffff96f8fb75ad00
FS: 0000000000000000(0000) GS:ffff9703cf0c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010c9ca003 CR4: 00000000001706b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __die+0x1f/0x60
? page_fault_oops+0x158/0x450
? search_extable+0x22/0x30
? netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0
? search_module_extables+0xe/0x40
? exc_page_fault+0x62/0x120
? asm_exc_page_fault+0x22/0x30
? netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0
? netfs_pgpriv2_write_to_the_cache+0xf6/0x3f0
netfs_read_termination_worker+0x1f/0x60
process_one_work+0x138/0x2d0
worker_thread+0x2a5/0x3b0
? __pfx_worker_thread+0x10/0x10
kthread+0xba/0xe0
? __pfx_kthread+0x10/0x10
ret_from_fork+0x30/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
Note this patch doesn't apply to v6.14 as it was obsoleted by commit
e2d46f2ec332 ("netfs: Change the read result collector to only use one
work item").
---
fs/netfs/read_pgpriv2.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index 54d5004fec18..e72f5e674834 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -181,16 +181,17 @@ void netfs_pgpriv2_write_to_the_cache(struct netfs_io_request *rreq)
break;
folioq_unmark3(folioq, slot);
- if (!folioq->marks3) {
+ while (!folioq->marks3) {
folioq = folioq->next;
if (!folioq)
- break;
+ goto end_of_queue;
}
slot = __ffs(folioq->marks3);
folio = folioq_folio(folioq, slot);
}
+end_of_queue:
netfs_issue_write(wreq, &wreq->io_streams[1]);
smp_wmb(); /* Write lists before ALL_QUEUED. */
set_bit(NETFS_RREQ_ALL_QUEUED, &wreq->flags);
--
2.47.2
Max Kellermann <max.kellermann@ionos.com> wrote:
> At the beginning of the function, folio queues with marks3==0 are
> skipped, but after that, the `marks3` field is ignored. If one such
> queue is found, `slot` is set to 64 (because `__ffs(0)==64`), leading
> to a buffer overflow in the folioq_folio() call. The resulting crash
> may look like this:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> ...
>
> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
> Cc: stable@vger.kernel.org
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
On Mon, Feb 10, 2025 at 11:31:44PM +0100, Max Kellermann wrote:
> At the beginning of the function, folio queues with marks3==0 are
> skipped, but after that, the `marks3` field is ignored. If one such
> queue is found, `slot` is set to 64 (because `__ffs(0)==64`), leading
> to a buffer overflow in the folioq_folio() call. The resulting crash
> may look like this:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] SMP PTI
> CPU: 11 UID: 0 PID: 2909 Comm: kworker/u262:1 Not tainted 6.13.1-cm4all2-vm #415
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> Workqueue: events_unbound netfs_read_termination_worker
> RIP: 0010:netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0
> Code: 48 85 c0 48 89 44 24 08 0f 84 24 01 00 00 48 8b 80 40 01 00 00 48 8b 7c 24 08 f3 48 0f bc c0 89 44 24 18 89 c0 48 8b 74 c7 08 <48> 8b 06 48 c7 04 24 00 10 00 00 a8 40 74 10 0f b6 4e 40 b8 00 10
> RSP: 0018:ffffbbc440effe18 EFLAGS: 00010203
> RAX: 0000000000000040 RBX: ffff96f8fc034000 RCX: 0000000000000000
> RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffff96f8fc036400
> RBP: 0000000000001000 R08: ffff96f9132bb400 R09: 0000000000001000
> R10: ffff96f8c1263c80 R11: 0000000000000003 R12: 0000000000001000
> R13: ffff96f8fb75ade8 R14: fffffaaf5ca90000 R15: ffff96f8fb75ad00
> FS: 0000000000000000(0000) GS:ffff9703cf0c0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000010c9ca003 CR4: 00000000001706b0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> ? __die+0x1f/0x60
> ? page_fault_oops+0x158/0x450
> ? search_extable+0x22/0x30
> ? netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0
> ? search_module_extables+0xe/0x40
> ? exc_page_fault+0x62/0x120
> ? asm_exc_page_fault+0x22/0x30
> ? netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0
> ? netfs_pgpriv2_write_to_the_cache+0xf6/0x3f0
> netfs_read_termination_worker+0x1f/0x60
> process_one_work+0x138/0x2d0
> worker_thread+0x2a5/0x3b0
> ? __pfx_worker_thread+0x10/0x10
> kthread+0xba/0xe0
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x30/0x50
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
> Cc: stable@vger.kernel.org
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
> ---
> Note this patch doesn't apply to v6.14 as it was obsoleted by commit
> e2d46f2ec332 ("netfs: Change the read result collector to only use one
> work item").
Why can't we just take what is upstream instead?
Diverging from that ALWAYS ends up being more work and problems in the
end. Only do so if you have no other choice.
thanks,
greg k-h
On Tue, Feb 11, 2025 at 7:30 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > Note this patch doesn't apply to v6.14 as it was obsoleted by commit
> > e2d46f2ec332 ("netfs: Change the read result collector to only use one
> > work item").
>
> Why can't we just take what is upstream instead?
>
> Diverging from that ALWAYS ends up being more work and problems in the
> end. Only do so if you have no other choice.
Usually I agree with that, and I trust that you will make the right decision.
Before you decide, let me point out that netfs has been extremely
unstable since 6.10 (July 2024), ever since commit 2e9d7e4b984a ("mm:
Remove the PG_fscache alias for PG_private_2). All of our web servers
have been crashing since 6.10 all the time (see
https://lore.kernel.org/netfs/CAKPOu+_DA8XiMAA2ApMj7Pyshve_YWknw8Hdt1=zCy9Y87R1qw@mail.gmail.com/
for one of several bug reports I posted), and I went through
considerable trouble by resisting the pressure from people asking me
to downgrade to 6.6. (I want the bugs fixed, I don't want to go back.)
For several months, 6.12 had been crashing instantly on boot due to
yet another netfs regression
(https://lore.kernel.org/netfs/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/)
which wasn't fixed until 6.12.11, so our production servers are still
on 6.11.11 today.
Before these bugs got ironed out, v6.11 commit ee4cdf7ba857 ("netfs:
Speed up buffered reading") wreaked more havoc, leading to 8 "Fixes"
commits so far, plus the 2 I posted yesterday.
I wrote that e2d46f2ec332 has obsoleted my patch (actually both
patches), but I don't know if it really fixes both bugs. The code that
was buggy does not exist anymore in the form that my patch addresses,
but I don't know if it was just refactored and the bugs were kept (or
maybe yet more bugs sneaked in).
Max
On Tue, Feb 11, 2025 at 09:05:44AM +0100, Max Kellermann wrote:
> On Tue, Feb 11, 2025 at 7:30 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> > > Note this patch doesn't apply to v6.14 as it was obsoleted by commit
> > > e2d46f2ec332 ("netfs: Change the read result collector to only use one
> > > work item").
> >
> > Why can't we just take what is upstream instead?
> >
> > Diverging from that ALWAYS ends up being more work and problems in the
> > end. Only do so if you have no other choice.
>
> Usually I agree with that, and I trust that you will make the right decision.
I defer to the maintainers of the subsystem here to make that decision,
as they are the ones that will be getting my "FAILED" backport emails
when things fail to backport due to this being out-of-tree :)
thanks,
greg k-h
© 2016 - 2026 Red Hat, Inc.