drivers/nvme/host/core.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
When validating a namespace, nvme_update_ns_info()
would be skipped if nsid changed. However, this
happens everytime the in-use controller is
reattached if NID is bogus, causing nsid not being
restored to the previous one, eg /dev/nvme0n2 ->
/dev/nvme0n1.
Don't skip ns info updates on this circumstance.
Signed-off-by: Yihan Xin <xyh1996@gmail.com>
---
drivers/nvme/host/core.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1236e3aa00ed..c0875fb93b8d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3979,11 +3979,24 @@ static void nvme_validate_ns(struct nvme_ns *ns, struct nvme_ns_info *info)
int ret = NVME_SC_INVALID_NS | NVME_STATUS_DNR;
if (!nvme_ns_ids_equal(&ns->head->ids, &info->ids)) {
- dev_err(ns->ctrl->device,
- "identifiers changed for nsid %d\n", ns->head->ns_id);
- goto out;
+ /*
+ * Don't skip ns info updates if the NID is bogus as it
+ * changes everytime the in-use controller is reattached
+ * to the bus and thus the namespace is recognized as
+ * another one.
+ */
+ if (ns->ctrl->quirks & NVME_QUIRK_BOGUS_NID) {
+ dev_info(ns->ctrl->device,
+ "Ignoring nsid change for bogus ns\n");
+ } else {
+ dev_err(ns->ctrl->device,
+ "identifiers changed for nsid %d\n",
+ ns->head->ns_id);
+ goto out;
+ }
}
+
ret = nvme_update_ns_info(ns, info);
out:
/*
--
2.46.0
On Tue, Sep 10, 2024 at 05:50:06PM +0800, Yihan Xin wrote: > When validating a namespace, nvme_update_ns_info() > would be skipped if nsid changed. However, this > happens everytime the in-use controller is > reattached if NID is bogus, causing nsid not being > restored to the previous one, eg /dev/nvme0n2 -> > /dev/nvme0n1. What do you mean with restoring the nsid?
Dear Christoph and others, I mean that, for example, there is nsid 1 on the device, the kernel discovers it as nvme0n1. When the device had been reconnected (in my scenario, after resuming from an s2idle), the kernel would recognize the namespace as nsid 2, although it is still nsid 1 on the device, which makes the block device not found. As Hannes said, it shall be myself misunderstood the reason that leads to this situation. I will try looking into the real reason. Maybe there is something wrong in the controller? I am sorry for the inconvenience causing to you all. (Please forgive my poor misunderstanding, I'm very new to kernel development.) Best regards, Yihan Xin Christoph Hellwig 於 2024/9/12 17:40 寫道: > On Tue, Sep 10, 2024 at 05:50:06PM +0800, Yihan Xin wrote: >> When validating a namespace, nvme_update_ns_info() >> would be skipped if nsid changed. However, this >> happens everytime the in-use controller is >> reattached if NID is bogus, causing nsid not being >> restored to the previous one, eg /dev/nvme0n2 -> >> /dev/nvme0n1. > > What do you mean with restoring the nsid? >
On 9/10/24 11:50, Yihan Xin wrote: > When validating a namespace, nvme_update_ns_info() > would be skipped if nsid changed. However, this > happens everytime the in-use controller is > reattached if NID is bogus, causing nsid not being > restored to the previous one, eg /dev/nvme0n2 -> > /dev/nvme0n1. > > Don't skip ns info updates on this circumstance. > > Signed-off-by: Yihan Xin <xyh1996@gmail.com> > --- > drivers/nvme/host/core.c | 19 ++++++++++++++++--- > 1 file changed, 16 insertions(+), 3 deletions(-) > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > index 1236e3aa00ed..c0875fb93b8d 100644 > --- a/drivers/nvme/host/core.c > +++ b/drivers/nvme/host/core.c > @@ -3979,11 +3979,24 @@ static void nvme_validate_ns(struct nvme_ns *ns, struct nvme_ns_info *info) > int ret = NVME_SC_INVALID_NS | NVME_STATUS_DNR; > > if (!nvme_ns_ids_equal(&ns->head->ids, &info->ids)) { > - dev_err(ns->ctrl->device, > - "identifiers changed for nsid %d\n", ns->head->ns_id); > - goto out; > + /* > + * Don't skip ns info updates if the NID is bogus as it > + * changes everytime the in-use controller is reattached > + * to the bus and thus the namespace is recognized as > + * another one. > + */ > + if (ns->ctrl->quirks & NVME_QUIRK_BOGUS_NID) { > + dev_info(ns->ctrl->device, > + "Ignoring nsid change for bogus ns\n"); > + } else { > + dev_err(ns->ctrl->device, > + "identifiers changed for nsid %d\n", > + ns->head->ns_id); > + goto out; > + } > } > > + > ret = nvme_update_ns_info(ns, info); > out: > /* Nope. A namespace is identified by both, the NSID and the namespace identifiers (GUID, UUID, you name it). In our implementation we identify the namespace by NSID, and then validate that the identifers match. If you have a device which changes NSIDs you would need to swizzle that logic around, ie identify the namespace by GUID/UUID, and then check if the NSID matches. But you'll run into issues here, as the NSID is stored in head->ns_id, so if you attach a namespace with a different NSID you'll end up addressing the wrong namespace when sending commands. So this 'fix' is actually wrong ... Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
© 2016 - 2024 Red Hat, Inc.