drivers/scsi/scsi_scan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Filesystems can write to disk from page reclaim with filesystem
locks held, if __GFP_FS is set. Marc found a case where
scsi_realloc_sdev_budget_map ends up in page reclaim with GFP_KERNEL,
where it could try to take filesystem locks again, leading to a deadlock.
WARNING: possible circular locking dependency detected
6.13.0 #1 Not tainted
------------------------------------------------------
kswapd0/70 is trying to acquire lock:
ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0
but task is already holding lock:
ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760
The full lockdep splat can be found in Marc's report:
https://lkml.org/lkml/2025/1/24/1101
Avoid the potential deadlock by doing the allocation with GFP_NOFS.
Reported-by: Marc Aurèle La France <tsi@tuyoix.net>
Signed-off-by: Rik van Riel <riel@surriel.com>
---
drivers/scsi/scsi_scan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f2093982b3db..93d6feef9c7c 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -245,7 +245,7 @@ static int scsi_realloc_sdev_budget_map(struct scsi_device *sdev,
}
ret = sbitmap_init_node(&sdev->budget_map,
scsi_device_max_queue_depth(sdev),
- new_shift, GFP_KERNEL,
+ new_shift, GFP_NOFS,
sdev->request_queue->node, false, true);
if (!ret)
sbitmap_resize(&sdev->budget_map, depth);
--
2.47.1
GFP_NOFS is never the right thing for block layer allocations. The right thing here is GFP_NOIO which is a superset of GFP_NOFS. Otherwise you could reproduce the same deadlock when using swap instead of a file system to reproduce basically the same deadlock. Note that this: https://lore.kernel.org/linux-block/20250117074442.256705-3-hch@lst.de/T/#u should probably fix the actual deadlock, but it might still need annotations for lockdep to deal with the initial probing where the queue is not frozen. Compared to hacky annotations just using GFP_NOIO feels simpler and more obvious.
On Tue, 28 Jan 2025 21:35:18 -0800
Christoph Hellwig <hch@infradead.org> wrote:
> GFP_NOFS is never the right thing for block layer allocations.
> The right thing here is GFP_NOIO which is a superset of GFP_NOFS.
> Otherwise you could reproduce the same deadlock when using swap
> instead of a file system to reproduce basically the same deadlock.
Duh, you are right of course!
The fixed up patch with GFP_NOIO is below.
---8<---
From 74272b4537415fd7d94c216e422510c27aa88fa0 Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel@surriel.com>
Date: Tue, 28 Jan 2025 16:35:39 -0500
Subject: [PATCH] scsi: use GFP_NOIO to avoid circular locking dependency
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Filesystems can write to disk from page reclaim with __GFP_FS
set. Marc found a case where scsi_realloc_sdev_budget_map
ends up in page reclaim with GFP_KERNEL, where it could try
to take filesystem locks again, leading to a deadlock.
WARNING: possible circular locking dependency detected
6.13.0 #1 Not tainted
------------------------------------------------------
kswapd0/70 is trying to acquire lock:
ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0
but task is already holding lock:
ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760
The full lockdep splat can be found in Marc's report:
https://lkml.org/lkml/2025/1/24/1101
Avoid the potential deadlock by doing the allocation with GFP_NOIO,
which prevents both filesystem and block layer recursion.
Reported-by: Marc Aurèle La France <tsi@tuyoix.net>
Signed-off-by: Rik van Riel <riel@surriel.com>
---
drivers/scsi/scsi_scan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f2093982b3db..b0964b6dd646 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -245,7 +245,7 @@ static int scsi_realloc_sdev_budget_map(struct scsi_device *sdev,
}
ret = sbitmap_init_node(&sdev->budget_map,
scsi_device_max_queue_depth(sdev),
- new_shift, GFP_KERNEL,
+ new_shift, GFP_NOIO,
sdev->request_queue->node, false, true);
if (!ret)
sbitmap_resize(&sdev->budget_map, depth);
--
2.47.1
On Wed, 29 Jan 2025 10:45:25 -0500, Rik van Riel wrote:
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Filesystems can write to disk from page reclaim with __GFP_FS
> set. Marc found a case where scsi_realloc_sdev_budget_map
> ends up in page reclaim with GFP_KERNEL, where it could try
> to take filesystem locks again, leading to a deadlock.
>
> [...]
Applied to 6.14/scsi-fixes, thanks!
[1/1] scsi: use GFP_NOIO to avoid circular locking dependency
https://git.kernel.org/mkp/scsi/c/5363ee9d110e
--
Martin K. Petersen Oracle Linux Engineering
On Wed, Jan 29, 2025 at 10:45:25AM -0500, Rik van Riel wrote: > On Tue, 28 Jan 2025 21:35:18 -0800 > Christoph Hellwig <hch@infradead.org> wrote: > > > GFP_NOFS is never the right thing for block layer allocations. > > The right thing here is GFP_NOIO which is a superset of GFP_NOFS. > > Otherwise you could reproduce the same deadlock when using swap > > instead of a file system to reproduce basically the same deadlock. > > Duh, you are right of course! > > The fixed up patch with GFP_NOIO is below. Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de>
© 2016 - 2026 Red Hat, Inc.