drivers/nvme/host/core.c | 45 ++++++++++++++++++---------------------- 1 file changed, 20 insertions(+), 25 deletions(-)
From: "Chunguang.xu" <chunguang.xu@shopee.com>
Now spdk will change UUID of ns while restarted if we have not
specified one. At this time, while host try to reconnected to target,
as UUID have changed, we will remove the old ns, but not add the ns
with the new UUID. As a result ctrl with no ns, and we need to
disconnect and connect to get the new ns. Here try to add ns with the
new UUID automatically.
Reported-by: Yingfu.zhou <yingfu.zhou@shopee.com>
Signed-off-by: Chunguang.xu <chunguang.xu@shopee.com>
---
drivers/nvme/host/core.c | 45 ++++++++++++++++++----------------------
1 file changed, 20 insertions(+), 25 deletions(-)
V2: Add missed reporter
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 855b42c92284..425f59fc80d5 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3991,28 +3991,6 @@ static void nvme_ns_remove_by_nsid(struct nvme_ctrl *ctrl, u32 nsid)
}
}
-static void nvme_validate_ns(struct nvme_ns *ns, struct nvme_ns_info *info)
-{
- int ret = NVME_SC_INVALID_NS | NVME_STATUS_DNR;
-
- if (!nvme_ns_ids_equal(&ns->head->ids, &info->ids)) {
- dev_err(ns->ctrl->device,
- "identifiers changed for nsid %d\n", ns->head->ns_id);
- goto out;
- }
-
- ret = nvme_update_ns_info(ns, info);
-out:
- /*
- * Only remove the namespace if we got a fatal error back from the
- * device, otherwise ignore the error and just move on.
- *
- * TODO: we should probably schedule a delayed retry here.
- */
- if (ret > 0 && (ret & NVME_STATUS_DNR))
- nvme_ns_remove(ns);
-}
-
static void nvme_scan_ns(struct nvme_ctrl *ctrl, unsigned nsid)
{
struct nvme_ns_info info = { .nsid = nsid };
@@ -4051,11 +4029,28 @@ static void nvme_scan_ns(struct nvme_ctrl *ctrl, unsigned nsid)
ns = nvme_find_get_ns(ctrl, nsid);
if (ns) {
- nvme_validate_ns(ns, &info);
+ if (!nvme_ns_ids_equal(&ns->head->ids, &info.ids)) {
+ dev_err(ns->ctrl->device,
+ "identifiers changed for nsid %d\n", ns->head->ns_id);
+ nvme_ns_remove(ns);
+ nvme_put_ns(ns);
+ goto alloc;
+ }
+
+ ret = nvme_update_ns_info(ns, &info);
+ /*
+ * Only remove the namespace if we got a fatal error back from the
+ * device, otherwise ignore the error and just move on.
+ *
+ * TODO: we should probably schedule a delayed retry here.
+ */
+ if (ret > 0 && (ret & NVME_STATUS_DNR))
+ nvme_ns_remove(ns);
nvme_put_ns(ns);
- } else {
- nvme_alloc_ns(ctrl, &info);
+ return;
}
+ alloc:
+ nvme_alloc_ns(ctrl, &info);
}
/**
--
2.25.1
On Fri, Nov 15, 2024 at 04:37:27PM +0800, brookxu.cn wrote: > From: "Chunguang.xu" <chunguang.xu@shopee.com> > > Now spdk will change UUID of ns while restarted if we have not > specified one. At this time, while host try to reconnected to target, > as UUID have changed, we will remove the old ns, but not add the ns > with the new UUID. And that is broken behavior. The host must assume the namespace has been deleted and recreated when the eui/nguid/uuid change, and we need to catch this. Fix your broken target code instead.
Christoph Hellwig <hch@lst.de> 于2024年11月16日周六 01:04写道: > > On Fri, Nov 15, 2024 at 04:37:27PM +0800, brookxu.cn wrote: > > From: "Chunguang.xu" <chunguang.xu@shopee.com> > > > > Now spdk will change UUID of ns while restarted if we have not > > specified one. At this time, while host try to reconnected to target, > > as UUID have changed, we will remove the old ns, but not add the ns > > with the new UUID. > > And that is broken behavior. The host must assume the namespace has > been deleted and recreated when the eui/nguid/uuid change, and we need > to catch this. Yes, now we have remove the old ns and log the change to dmesg, but I am confused why not auto add the ns with new UUID, we should treat it as a new ns? so that we can avoid an active controller with no ns, but actually it have one. >Fix your broken target code instead. >
On Sat, Nov 16, 2024 at 08:49:12AM +0800, 许春光 wrote: > Yes, now we have remove the old ns and log the change to dmesg, but I am > confused why not auto add the ns with new UUID, we should treat it as a new > ns? so that we can avoid an active controller with no ns, but actually it have > one. Because as far as the specification is concerned it is. The whole point of these identifiers is that they are stable over the life time of the namespace.
Christoph Hellwig <hch@lst.de> 于2024年11月18日周一 14:26写道: > > On Sat, Nov 16, 2024 at 08:49:12AM +0800, 许春光 wrote: > > Yes, now we have remove the old ns and log the change to dmesg, but I am > > confused why not auto add the ns with new UUID, we should treat it as a new > > ns? so that we can avoid an active controller with no ns, but actually it have > > one. > > Because as far as the specification is concerned it is. The whole point of > these identifiers is that they are stable over the life time of the > namespace. Noted, Thanks very much~
© 2016 - 2024 Red Hat, Inc.