syzkaller reported a null-pointer dereference bug
(https://syzkaller.appspot.com/bug?extid=af5c06208fa71bf31b16) in
netfs_writeback_unlock_folios caused by passing a NULL folioq to
folioq_folio. Fix by adding a check before entering the loop.
Signed-off-by: Chang Yu <marcus.yu.56@gmail.com>
Reported-by: syzbot+af5c06208fa71bf31b16@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=af5c06208fa71bf31b16
Fixes: cd0277ed0c18 ("netfs: Use new folio_queue data type and iterator instead of xarray iter")
---
fs/netfs/write_collect.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c
index 1d438be2e1b4..23d46a409ff2 100644
--- a/fs/netfs/write_collect.c
+++ b/fs/netfs/write_collect.c
@@ -98,7 +98,7 @@ static void netfs_writeback_unlock_folios(struct netfs_io_request *wreq,
 		slot = 0;
 	}
 
-	for (;;) {
+	while (folioq) {
 		struct folio *folio;
 		struct netfs_folio *finfo;
 		unsigned long long fpos, fend;
--
2.47.0
Chang Yu <marcus.yu.56@gmail.com> wrote:

> syzkaller reported a null-pointer dereference bug
> (https://syzkaller.appspot.com/bug?extid=af5c06208fa71bf31b16) in
> netfs_writeback_unlock_folios caused by passing a NULL folioq to
> folioq_folio. Fix by adding a check before entering the loop.

And, of course, the preceding:

	if (slot >= folioq_nr_slots(folioq)) {

doesn't oops because it doesn't actually dereference folioq.

However... if we get into this function, there absolutely *should* be at
least one folioq in the rolling buffer.  Part of the rolling buffer's method
of operation involves keeping at least one folioq around at all times so
that we don't need to use locks to add/remove from the queue.

Either the rolling buffer wasn't initialised yet (and it should be
initialised for all write requests by netfs_create_write_req()) or it has
been destroyed already.

Either way, your patch is, unfortunately, just covering up the symptoms
rather than fixing the root cause.  I suggest instead that we patch the
function to detect the empty rolling buffer up front, dump some information
about the bad request and return.

David
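For illustration, the kind of up-front check being suggested might look
roughly like the sketch below. This is only a sketch, not the maintainer's
patch: the use of WARN_ON_ONCE()/pr_warn(), the message text, and the exact
wreq->buffer/wreq->debug_id details are assumptions made for the example.

	static void netfs_writeback_unlock_folios(struct netfs_io_request *wreq,
						  unsigned long long collected_to,
						  unsigned int *notes)
	{
		struct folio_queue *folioq = wreq->buffer; /* field name assumed */
		...

		/* The rolling buffer should always hold at least one folioq here;
		 * if it doesn't, the request is already in a bad state, so report
		 * the bad request and return rather than oopsing in the loop below.
		 */
		if (WARN_ON_ONCE(!folioq)) {
			pr_warn("netfs: R=%08x: empty rolling buffer in writeback unlock\n",
				wreq->debug_id);
			return;
		}
		...
	}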
On Fri, Oct 25, 2024 at 09:05:53AM +0100, David Howells wrote:
> Chang Yu <marcus.yu.56@gmail.com> wrote:
>
> > syzkaller reported a null-pointer dereference bug
> > (https://syzkaller.appspot.com/bug?extid=af5c06208fa71bf31b16) in
> > netfs_writeback_unlock_folios caused by passing a NULL folioq to
> > folioq_folio. Fix by adding a check before entering the loop.
>
> And, of course, the preceding:
>
> 	if (slot >= folioq_nr_slots(folioq)) {
>
> doesn't oops because it doesn't actually dereference folioq.
>
> However... if we get into this function, there absolutely *should* be at least
> one folioq in the rolling buffer. Part of the rolling buffer's method of
> operation involves keeping at least one folioq around at all times so that we
> don't need to use locks to add/remove from the queue.
>
> Either the rolling buffer wasn't initialised yet (and it should be initialised
> for all write requests by netfs_create_write_req()) or it has been destroyed
> already.
>
> Either way, your patch is, unfortunately, just covering up the symptoms rather
> than fixing the root cause. I suggest instead that we patch the function to
> detect the empty rolling buffer up front, dump some information about the bad
> request and return.
>
> David

I see. This might be a stupid question, but is it ever possible that we
have exactly one folioq and at the same time

	slot >= folioq_nr_slots(folioq)

is true? Then I imagine netfs_delete_buffer_head would return NULL and
cause the bug to trigger as well?
Chang Yu <marcus.yu.56@gmail.com> wrote:

> I see. This might be a stupid question, but is it ever possible that we
> have exactly one folioq and at the same time
>
> 	slot >= folioq_nr_slots(folioq)
>
> is true? Then I imagine netfs_delete_buffer_head would return NULL and
> cause the bug to trigger as well?

Whilst it is possible for "slot >= folioq_nr_slots(folioq)" to be true on
what is currently the last folioq, wreq->cleaned_to suggests that there must
be still-locked folios in the queue:

	unsigned long long clean_to = min(wreq->collected_to, wreq->contiguity);

	if (wreq->cleaned_to < clean_to)
		netfs_writeback_unlock_folios(wreq, clean_to, &notes);

David