fs/netfs/read_collect.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
When checking whether the edges of adjacent subrequests touch, the
`prev` variable is deferenced, but it might not have been initialized.
This causes crashes like this one:
BUG: unable to handle page fault for address: 0000000181343843
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 8000001c66db0067 P4D 8000001c66db0067 PUD 0
Oops: Oops: 0000 [#1] SMP PTI
CPU: 1 UID: 33333 PID: 24424 Comm: php-cgi8.2 Kdump: loaded Not tainted 6.13.2-cm4all0-hp+ #427
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 11/23/2021
RIP: 0010:netfs_consume_read_data.isra.0+0x5ef/0xb00
Code: fe ff ff 48 8b 83 88 00 00 00 48 8b 4c 24 30 4c 8b 43 78 48 85 c0 48 8d 51 70 75 20 48 8b 73 30 48 39 d6 74 17 48 8b 7c 24 40 <48> 8b 4f 78 48 03 4f 68 48 39 4b 68 0f 84 ab 02 00 00 49 29 c0 48
RSP: 0000:ffffc90037adbd00 EFLAGS: 00010283
RAX: 0000000000000000 RBX: ffff88811bda0600 RCX: ffff888620e7b980
RDX: ffff888620e7b9f0 RSI: ffff88811bda0428 RDI: 00000001813437cb
RBP: 0000000000000000 R08: 0000000000004000 R09: 0000000000000000
R10: ffffffff82e070c0 R11: 0000000007ffffff R12: 0000000000004000
R13: ffff888620e7bb68 R14: 0000000000008000 R15: ffff888620e7bb68
FS: 00007ff2e0e7ddc0(0000) GS:ffff88981f840000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000181343843 CR3: 0000001bc10ba006 CR4: 00000000001706f0
Call Trace:
<TASK>
? __die+0x1f/0x60
? page_fault_oops+0x15c/0x450
? search_extable+0x22/0x30
? netfs_consume_read_data.isra.0+0x5ef/0xb00
? search_module_extables+0xe/0x40
? exc_page_fault+0x5e/0x100
? asm_exc_page_fault+0x22/0x30
? netfs_consume_read_data.isra.0+0x5ef/0xb00
? intel_iommu_unmap_pages+0xaa/0x190
? __pfx_cachefiles_read_complete+0x10/0x10
netfs_read_subreq_terminated+0x24f/0x390
cachefiles_read_complete+0x48/0xf0
iomap_dio_bio_end_io+0x125/0x160
blk_update_request+0xea/0x3e0
scsi_end_request+0x27/0x190
scsi_io_completion+0x43/0x6c0
blk_complete_reqs+0x40/0x50
handle_softirqs+0xd1/0x280
irq_exit_rcu+0x91/0xb0
common_interrupt+0x3b/0xa0
asm_common_interrupt+0x22/0x40
RIP: 0033:0x55fe8470d2ab
Code: 00 00 3c 7e 74 3b 3c b6 0f 84 dd 03 00 00 3c 1e 74 2f 83 c1 01 48 83 c2 38 48 83 c7 30 44 39 d1 74 3e 48 63 42 08 85 c0 79 a3 <49> 8b 46 48 8b 04 38 f6 c4 04 75 0b 0f b6 42 30 83 e0 0c 3c 04 75
RSP: 002b:00007ffca5ef2720 EFLAGS: 00000216
RAX: 0000000000000023 RBX: 0000000000000008 RCX: 000000000000001b
RDX: 00007ff2e0cdb6f8 RSI: 0000000000000006 RDI: 0000000000000510
RBP: 00007ffca5ef27a0 R08: 00007ffca5ef2720 R09: 0000000000000001
R10: 000000000000001e R11: 00007ff2e0c10d08 R12: 0000000000000001
R13: 0000000000000120 R14: 00007ff2e0cb1ed0 R15: 00000000000000b0
</TASK>
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
David/Greg: just like the other two netfs patches I sent yesterday,
this one doesn't apply to v6.14 as it was obsoleted by commit
e2d46f2ec332 ("netfs: Change the read result collector to only use one
work item").
---
fs/netfs/read_collect.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index e8624f5c7fcc..a7f285c52a79 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -258,17 +258,19 @@ static bool netfs_consume_read_data(struct netfs_io_subrequest *subreq, bool was
*/
if (!subreq->consumed &&
!prev_donated &&
- !list_is_first(&subreq->rreq_link, &rreq->subrequests) &&
- subreq->start == prev->start + prev->len) {
+ !list_is_first(&subreq->rreq_link, &rreq->subrequests)) {
prev = list_prev_entry(subreq, rreq_link);
- WRITE_ONCE(prev->next_donated, prev->next_donated + subreq->len);
- subreq->start += subreq->len;
- subreq->len = 0;
- subreq->transferred = 0;
- trace_netfs_donate(rreq, subreq, prev, subreq->len,
- netfs_trace_donate_to_prev);
- trace_netfs_sreq(subreq, netfs_sreq_trace_donate_to_prev);
- goto remove_subreq_locked;
+ if (subreq->start == prev->start + prev->len) {
+ prev = list_prev_entry(subreq, rreq_link);
+ WRITE_ONCE(prev->next_donated, prev->next_donated + subreq->len);
+ subreq->start += subreq->len;
+ subreq->len = 0;
+ subreq->transferred = 0;
+ trace_netfs_donate(rreq, subreq, prev, subreq->len,
+ netfs_trace_donate_to_prev);
+ trace_netfs_sreq(subreq, netfs_sreq_trace_donate_to_prev);
+ goto remove_subreq_locked;
+ }
}
/* If we can't donate down the chain, donate up the chain instead. */
--
2.47.2
Max Kellermann <max.kellermann@ionos.com> wrote:
> prev = list_prev_entry(subreq, rreq_link);
> ...
> + if (subreq->start == prev->start + prev->len) {
> + prev = list_prev_entry(subreq, rreq_link);
Actually, that doubles the setting of prev redundantly. It shouldn't hurt,
but we might want to remove the inner one.
David
On Fri, Feb 14, 2025 at 2:03 PM David Howells <dhowells@redhat.com> wrote:
>
> Max Kellermann <max.kellermann@ionos.com> wrote:
>
> > prev = list_prev_entry(subreq, rreq_link);
> > ...
> > + if (subreq->start == prev->start + prev->len) {
> > + prev = list_prev_entry(subreq, rreq_link);
>
> Actually, that doubles the setting of prev redundantly. It shouldn't hurt,
> but we might want to remove the inner one.
Oh right, that line shouldn't exist twice. (Previously, it was set too
late, after the variable had already been dereferenced.) I'll send v2
with only one copy.
Max Kellermann <max.kellermann@ionos.com> wrote:
> When checking whether the edges of adjacent subrequests touch, the
> `prev` variable is deferenced, but it might not have been initialized.
> This causes crashes like this one:
>
> BUG: unable to handle page fault for address: 0000000181343843
> ...
>
> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
> Cc: stable@vger.kernel.org
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
On Fri, Feb 14, 2025 at 1:53 PM David Howells <dhowells@redhat.com> wrote: > Signed-off-by: David Howells <dhowells@redhat.com> Thanks David. By the way, we have been running 6.13.2 with these 3 patches on dozens of production servers since I submitted them. All netfs problems that had been haunting us since 6.10 are gone. No more crashes, for the first time since last summer.
© 2016 - 2025 Red Hat, Inc.