From: Alistair Francis <alistair.francis@wdc.com>
The NVMe Base Specification 2.1 states that:
"""
A host requests an explicit persistent connection ... by specifying a
non-zero Keep Alive Timer value in the Connect command.
"""
As such, if we are starting a persistent connection to a discovery
controller and the KATO is currently 0, we need to update KATO to a
non-zero value to avoid continuous timeouts on the target.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
drivers/nvme/host/core.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 812c1565114f..bb9685b67338 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4986,8 +4986,14 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
 	 * checking that they started once before, hence are reconnecting back.
 	 */
 	if (test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags) &&
-	    nvme_discovery_ctrl(ctrl))
+	    nvme_discovery_ctrl(ctrl)) {
+		if (!ctrl->kato) {
+			nvme_stop_keep_alive(ctrl);
+			ctrl->kato = NVME_DEFAULT_KATO;
+			nvme_start_keep_alive(ctrl);
+		}
 		nvme_change_uevent(ctrl, "NVME_EVENT=rediscover");
+	}
 
 	if (ctrl->queue_count > 1) {
 		nvme_queue_scan(ctrl);
--
2.50.1
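
The decision the hunk above adds can be sketched outside the kernel as a small, self-contained model. This is only an illustration under stated assumptions: `struct sketch_ctrl`, `sketch_start_ctrl()` and `SKETCH_DEFAULT_KATO` are hypothetical names that do not exist in the kernel, the value 5 is an assumed stand-in for `NVME_DEFAULT_KATO`, and the keep-alive stop/start pair is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for illustration only; check NVME_DEFAULT_KATO in the tree. */
#define SKETCH_DEFAULT_KATO 5

/* Hypothetical, simplified stand-in for struct nvme_ctrl. */
struct sketch_ctrl {
	bool started_once;        /* models the NVME_CTRL_STARTED_ONCE flag */
	bool discovery;           /* models the result of nvme_discovery_ctrl() */
	unsigned int kato;        /* keep-alive timeout in seconds, 0 = unset */
	unsigned int ka_restarts; /* counts modeled stop/start keep-alive cycles */
};

/*
 * Models the patched branch of nvme_start_ctrl(): returns true when the
 * controller is treated as a reconnecting persistent discovery controller
 * (the case in which the rediscover uevent is sent). In that case a zero
 * KATO is replaced by the default and the keep-alive is restarted.
 */
static bool sketch_start_ctrl(struct sketch_ctrl *ctrl)
{
	if (!(ctrl->started_once && ctrl->discovery))
		return false;

	if (!ctrl->kato) {
		/* stands in for nvme_stop_keep_alive() + nvme_start_keep_alive() */
		ctrl->kato = SKETCH_DEFAULT_KATO;
		ctrl->ka_restarts++;
	}
	return true;
}
```

In this model a controller that already carries a non-zero KATO is left untouched, and a controller that is not a reconnecting discovery controller never has its KATO changed, which mirrors the narrow scope of the patch.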
On Tue, Sep 02, 2025 at 01:52:11PM +1000, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
>
> The NVMe Base Specification 2.1 states that:
>
> """
> A host requests an explicit persistent connection ... by specifying a
> non-zero Keep Alive Timer value in the Connect command.
> """
>
> As such if we are starting a persistent connection to a discovery
> controller and the KATO is currently 0 we need to update KATO to a non
> zero value to avoid continuous timeouts on the target.

Thanks, applied to nvme-6.18.
Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, 2025-09-02 at 13:52 +1000, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
>
> The NVMe Base Specification 2.1 states that:
>
> """
> A host requests an explicit persistent connection ... by specifying a
> non-zero Keep Alive Timer value in the Connect command.
> """
>
> As such if we are starting a persistent connection to a discovery
> controller and the KATO is currently 0 we need to update KATO to a
> non
> zero value to avoid continuous timeouts on the target.
>
>

When would this ever happen? Won't nvme-cli & nvme/host/fabrics.c in
the kernel ensure a PDC (persistent discovery controller) would always
have the KATO either default set to NVMF_DEF_DISC_TMO (i.e. 30s) or any
positive int value & not zero?

Do you have a test log for the above scenario where the KATO ends up
being zero for a PDC?

-Martin
On Tue, Sep 2, 2025 at 8:35 PM Martin George <martinus.gpy@gmail.com> wrote:
>
> On Tue, 2025-09-02 at 13:52 +1000, alistair23@gmail.com wrote:
> > From: Alistair Francis <alistair.francis@wdc.com>
> >
> > The NVMe Base Specification 2.1 states that:
> >
> > """
> > A host requests an explicit persistent connection ... by specifying a
> > non-zero Keep Alive Timer value in the Connect command.
> > """
> >
> > As such if we are starting a persistent connection to a discovery
> > controller and the KATO is currently 0 we need to update KATO to a
> > non
> > zero value to avoid continuous timeouts on the target.
> >
> >
>
> When would this ever happen? Won't nvme-cli & nvme/host/fabrics.c in

It occurs if you perform a `nvme connect` to the discovery nqn of a
Linux target.

> the kernel ensure a PDC (persistent discovery controller) would always
> have the KATO either default set to NVMF_DEF_DISC_TMO (i.e. 30s) or any
> positive int value & not zero?

The kernel doesn't set a default for discovery controllers (hence this
patch).

nvme-cli will only set NVMF_DEF_DISC_TMO if the `--persistent`
connection option is supplied to `nvme discover`. But it doesn't set a
KATO for `nvme connect`, even though it's a persistent connection.

Note, that I think that is a bug in nvme-cli and it should be setting a
non zero KATO. I plan on patching that. At the same time if the kernel
knows it's a persistent discovery connection it should also be setting
a default KATO.

The kernel is already using a default non-zero KATO for non discovery
nqns (see nvmf_parse_options()). This just extends the default to apply
to persistent discovery controllers.

>
> Do you have a test log for the above scenario where the KATO ends up
> being zero for a PDC?

I do, it's just a lot of keep-alive timeout prints on the target

Alistair

>
> -Martin
On 9/2/25 05:52, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
>
> The NVMe Base Specification 2.1 states that:
>
> """
> A host requests an explicit persistent connection ... by specifying a
> non-zero Keep Alive Timer value in the Connect command.
> """
>
> As such if we are starting a persistent connection to a discovery
> controller and the KATO is currently 0 we need to update KATO to a non
> zero value to avoid continuous timeouts on the target.
>
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
>  drivers/nvme/host/core.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich