When initializing an nvme request which is about to be sent to the block
layer, we do not need to initialize its timeout. If it is left
uninitialized at 0, the block layer will fall back to the request
queue's timeout in blk_add_timer() (reached via nvme_start_request(),
which is called from nvme_*_queue_rq()). These timeouts are set up as
either NVME_IO_TIMEOUT or NVME_ADMIN_TIMEOUT when the request queues
are created.
Because the io_timeout of the IO queues can actually be modified via
sysfs, the following situation can occur:
1) NVME_IO_TIMEOUT = 30 (default module parameter)
2) nvme1n1 is probed. IO queues default timeout is 30 s
3) manually change the IO timeout to 90 s
echo 90000 > /sys/class/nvme/nvme1/nvme1n1/queue/io_timeout
4) nvme zns report-zones /dev/nvme1n1
This command issues IO commands with a timeout of 30 s instead of
the wanted 90 s, which might be more suitable for this device.
This patch, therefore, improves the consistency of IO timeout usage.
However, there are still uses of NVME_IO_TIMEOUT which could be
inconsistent with what is set in the device's request_queue by the user.
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
---
drivers/nvme/host/core.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f1f719351f3f2..3a6d74e6dae11 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -724,10 +724,8 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
struct nvme_ns *ns = req->q->disk->private_data;
logging_enabled = ns->head->passthru_err_log_enabled;
- req->timeout = NVME_IO_TIMEOUT;
} else { /* no queuedata implies admin queue */
logging_enabled = nr->ctrl->passthru_err_log_enabled;
- req->timeout = NVME_ADMIN_TIMEOUT;
}
if (!logging_enabled)
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Christof Hellmis
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
On Tue, Dec 02, 2025 at 01:58:19PM +0000, Heyne, Maximilian wrote:
> When initializing an nvme request which is about to be send to the block
> layer, we do not need to initialize its timeout. If it's left
> uninitialized at 0 the block layer will use the request queue's timeout
> in blk_add_timer (via nvme_start_request which is called from
> nvme_*_queue_rq). These timeouts are setup to either NVME_IO_TIMEOUT or
> NVME_ADMIN_TIMEOUT when the request queues were created.
>
> Because the io_timeout of the IO queues can actually be modified via
> sysfs, the following situation can occur:
>
> 1) NVME_IO_TIMEOUT = 30 (default module parameter)
> 2) nvme1n1 is probed. IO queues default timeout is 30 s
> 3) manually change the IO timeout to 90 s
>    echo 90000 > /sys/class/nvme/nvme1/nvme1n1/queue/io_timeout
> 4) nvme zns report-zones /dev/nvme1n1
>    This command issues IO commands with timeout 30 s instead of the
>    wanted 90 s which might be more suitable for this device.

Does this example really use 30s, though? User space commands should be
going through nvme_submit_user_cmd(), which overrides the timeout set
from the nvme_init_request with whatever the user requested (usually 0).

The code change looks fine, though.
On Tue, Dec 02, 2025 at 10:39:11AM -0700, Keith Busch wrote:
> On Tue, Dec 02, 2025 at 01:58:19PM +0000, Heyne, Maximilian wrote:
> > When initializing an nvme request which is about to be send to the block
> > layer, we do not need to initialize its timeout. If it's left
> > uninitialized at 0 the block layer will use the request queue's timeout
> > in blk_add_timer (via nvme_start_request which is called from
> > nvme_*_queue_rq). These timeouts are setup to either NVME_IO_TIMEOUT or
> > NVME_ADMIN_TIMEOUT when the request queues were created.
> >
> > Because the io_timeout of the IO queues can actually be modified via
> > sysfs, the following situation can occur:
> >
> > 1) NVME_IO_TIMEOUT = 30 (default module parameter)
> > 2) nvme1n1 is probed. IO queues default timeout is 30 s
> > 3) manually change the IO timeout to 90 s
> > echo 90000 > /sys/class/nvme/nvme1/nvme1n1/queue/io_timeout
> > 4) nvme zns report-zones /dev/nvme1n1
> > This command issues IO commands with timeout 30 s instead of the
> > wanted 90 s which might be more suitable for this device.
>
> Does this example really use 30s, though? User space commands should be
> going through nvme_submit_user_cmd(), which overrides the timeout set
> from the nvme_init_request with whatever the user requested (usually 0).
You're right. I actually worked on multiple (older) kernel versions and
forgot about this case. It was commit 470e900c8036ff ("nvme: refactor
nvme_alloc_request") that subtly changed the behavior, but only for the
ioctl case. So ioctls are fine, but everything that goes via
nvme_submit_sync_cmd, for example, still shows the issue. So we need to
update the commit message accordingly. Sorry for that. I'll give it a
day or two for further comments on this patch and then resend it with
a corrected message.