From nobody Tue Dec 16 12:35:48 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E36E525E808 for ; Mon, 24 Mar 2025 12:08:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742818091; cv=none; b=kRZUCOQ8R86Az/mG7ocydkmSzW+1HUYP9mxazvqznqOQ3IoVoOuewKNCPredxiKVF05x8Ug386CqLF2/M/tckbrSSqTxR5jl4hlMr56ibpGK8rF3moeEYiGuVG3f2u6L4SNePbsOThlkjagnmseKZA7gkmF/C36vAlXZJkiEAwg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742818091; c=relaxed/simple; bh=MN4jerD6vCbfxLMpurlATwpB3u4sbGBPrL6S0ay1MPk=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=a0HrhHJbn0U76IRcy2aN9DZRP1yh5qkGm2pjRk+HoKcbBHoCQW4VHspZ3O/gl76AyVSkVXjckVOAfdoeSjI21DTzDGKRXiXXfjnbeLypxpbnilYNldWcqUYbw7IaRegs4ylgGeG13x3QYXVdLeiI1y6kD6ne0QdqucGmWVoPeug= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=D7uiS0r/; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="D7uiS0r/" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 31D81C4CEE9; Mon, 24 Mar 2025 12:08:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1742818090; bh=MN4jerD6vCbfxLMpurlATwpB3u4sbGBPrL6S0ay1MPk=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=D7uiS0r/ql6VYylNLycK0St8J0Pi8ApnBeggQlE3SbNMgbbAe0G6Cw+HRK3m66+si dpPkH6BF8spn/9IdDsaYXjQJzhN7qt3L2xtUnpZQOmuCrv76Gkqzw+T8tbg0gnBqwO pvB4Hm92iBUJyAGGXkgy7Hk+Fn/n4ule7n+pfhBLRx8lrquVWbtbZ3qHvAzOyFpVci 
vW6FlqoaHZP2cW+NPlNnpiAg/pme9GjcuD7ukm4I7pt3VBDYnCrSUbYkvyLfHJlnnF 6N35ThM2i7rJD3HxaWym0ixHzhwpOXxiXYw4cOmtdSeBj9qQFrD0ZaMSVrX0xUw5iD JDCC08PAmzP4g== From: Daniel Wagner Date: Mon, 24 Mar 2025 13:07:56 +0100 Subject: [PATCH RFC 1/3] nvmet: add command quiesce time Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250324-tp4129-v1-1-95a747b4c33b@kernel.org> References: <20250324-tp4129-v1-0-95a747b4c33b@kernel.org> In-Reply-To: <20250324-tp4129-v1-0-95a747b4c33b@kernel.org> To: Christoph Hellwig , Sagi Grimberg , Keith Busch , Hannes Reinecke , John Meneghini , randyj@purestorage.com, Mohamed Khalfella Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Daniel Wagner X-Mailer: b4 0.14.2 TP4129 introduces Command Quiesce Time (CQT) for coordinating the shutdown sequence when, for example, KATO expires. Add support to nvmet, but only report that CQT is available and that the controller doesn't need any additional time when shutting down. In this case the spec says nvmet should report a value of 1. Signed-off-by: Daniel Wagner Reviewed-by: Hannes Reinecke --- drivers/nvme/target/admin-cmd.c | 6 ++++++ drivers/nvme/target/nvmet.h | 1 + include/linux/nvme.h | 4 +++- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cm= d.c index e670dc185a967dc69c9b7d23930bb52bdcc3271a..09ac5a43f70dbe3889c1b404d6b= 59c0053337192 100644 --- a/drivers/nvme/target/admin-cmd.c +++ b/drivers/nvme/target/admin-cmd.c @@ -733,6 +733,12 @@ static void nvmet_execute_identify_ctrl(struct nvmet_r= eq *req) /* We support keep-alive timeout in granularity of seconds */ id->kas =3D cpu_to_le16(NVMET_KAS); =20 + /* + * Command Quiesce Time in milliseconds. If the controller does not + * need any quiesce time, the controller should set it to 1. 
+ */ + id->cqt =3D cpu_to_le16(NVMET_CQT); + id->sqes =3D (0x6 << 4) | 0x6; id->cqes =3D (0x4 << 4) | 0x4; =20 diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index b540216c0c9a9138f0913f8df28fa5ae13c6397f..47ae8be6200054eaaad2dbacf23= db080bf0c56c2 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -671,6 +671,7 @@ bool nvmet_subsys_nsid_exists(struct nvmet_subsys *subs= ys, u32 nsid); =20 #define NVMET_KAS 10 #define NVMET_DISC_KATO_MS 120000 +#define NVMET_CQT 1 =20 int __init nvmet_init_configfs(void); void __exit nvmet_exit_configfs(void); diff --git a/include/linux/nvme.h b/include/linux/nvme.h index fe3b60818fdcfbb4baabce59f7499bc1fa07e855..983b047e7158dcb33da66a25c67= 684b8f1ef5a7e 100644 --- a/include/linux/nvme.h +++ b/include/linux/nvme.h @@ -335,7 +335,9 @@ struct nvme_id_ctrl { __u8 anacap; __le32 anagrpmax; __le32 nanagrpid; - __u8 rsvd352[160]; + __u8 rsvd352[34]; + __le16 cqt; + __u8 rsvd388[124]; __u8 sqes; __u8 cqes; __le16 maxcmd; --=20 2.48.1 From nobody Tue Dec 16 12:35:48 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5C66B16F0FE for ; Mon, 24 Mar 2025 12:08:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742818092; cv=none; b=O6fO2Fg6/RfPsmeKPLyEq5ktglCB4D3+aENuaMg70bifbmgX2jYrFEiaPHzBq/R4PuH1ZanmXw+pr0RuPM5k2Bt+EQ/rcdkChwAeLb3K/A1zOtOWX/900c+cE4SlblUnbzc5/L5EcXeH0Gt0ttZgiCPQRd9IgVOsnALrCw22r6Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742818092; c=relaxed/simple; bh=C2XCpY3/qQ09F5paYDzzuRlhy0cJivYvdQwEXeEVWOw=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=gBajTLnJrh8h7j0EP8PExQn6S6YqqGUNdPTOk/N2fZO+uNW5hsq8AuQzjWBfW0SKN2Prkf+f1ajbe+KM8p+kBiSYLGJVB7469Q/V93c1TRDlhbYcc+6nLJrCxYgBAh61MU7YHvtcMzpD/i72lOwwsqfZNGt0g5b6DMZWu2ZijjA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ZDChiOHZ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ZDChiOHZ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A47B5C4CEE9; Mon, 24 Mar 2025 12:08:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1742818091; bh=C2XCpY3/qQ09F5paYDzzuRlhy0cJivYvdQwEXeEVWOw=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=ZDChiOHZujBnlf+IRJC/ofdiGpttxnijSwB1RPzx+zHQZpYesLEjDKulJ8T1KFgTG UEfeLq84e4CRqc8YbyflCsicnI47vkAMIx64RTJBNXBd5A8YhU/yr1fH4fv4FLSgzZ M6sbYuHGOs1jT3X3MUEjrZJCjgdJsZ9whq9y4qTeVGnSIA2AkZUAHvBuzBunUXS/Rs FY/7FLGY3cSTpo0gBSKQJOHQ+I27SGuG/5BjUxlAD+l59JWpJBf+EisrNMAsgcSlOA U3uTA4S8/3lkXYy648PHp4u3bzemcFQVv7C7okcPL0szDRZz5TB2bzgjxultYKhB5J w6z3zDmr4zFkA== From: Daniel Wagner Date: Mon, 24 Mar 2025 13:07:57 +0100 Subject: [PATCH RFC 2/3] nvme: store cqt value into nvme ctrl object Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250324-tp4129-v1-2-95a747b4c33b@kernel.org> References: <20250324-tp4129-v1-0-95a747b4c33b@kernel.org> In-Reply-To: <20250324-tp4129-v1-0-95a747b4c33b@kernel.org> To: Christoph Hellwig , Sagi Grimberg , Keith Busch , Hannes Reinecke , John Meneghini , randyj@purestorage.com, Mohamed Khalfella Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Daniel Wagner X-Mailer: b4 0.14.2 Signed-off-by: Daniel Wagner Reviewed-by: Hannes Reinecke --- drivers/nvme/host/core.c | 1 + 
drivers/nvme/host/nvme.h | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 40046770f1bf0b98261d8b80e21aa0cc04ebb7a0..135045528ea1c79eac0d6d47d5f= 7f05a7c98acc4 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -3380,6 +3380,7 @@ static int nvme_init_identify(struct nvme_ctrl *ctrl) ctrl->kas =3D le16_to_cpu(id->kas); ctrl->max_namespaces =3D le32_to_cpu(id->mnan); ctrl->ctratt =3D le32_to_cpu(id->ctratt); + ctrl->cqt =3D le16_to_cpu(id->cqt); =20 ctrl->cntrltype =3D id->cntrltype; ctrl->dctype =3D id->dctype; diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 7be92d07430e950c3faa764514daf3808009e223..7563332b5b7b76fc6165ec8c6f2= d144737d4fe85 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -344,6 +344,7 @@ struct nvme_ctrl { u32 oaes; u32 aen_result; u32 ctratt; + u16 cqt; unsigned int shutdown_timeout; unsigned int kato; bool subsystem; --=20 2.48.1 From nobody Tue Dec 16 12:35:48 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DBDB725EF8B for ; Mon, 24 Mar 2025 12:08:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742818093; cv=none; b=kMJns3hvGV8j45UCd5o35voo7xAuIkGpzJ8R1BBo4F9UssiHUmeHXozXjEFZpO0GyQaPczvpBl4GRburgMV5ki0mdA450CKtN/5AJ4cBucxoQjDZawaBf/GDG+xSd+a4Sjq3d+EmXb400sbHfC0PnM1kLb78pzXqyWRq/nwNugo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742818093; c=relaxed/simple; bh=lQUzCkgG/COzR/pHntbS6EXiWn+eykoAt4Nb+HdA+EM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=BJNtlRmhtQitfoWnXY16/0i1Ap1+SxxWh/A0aHrI720AVRy9XPD1jWLTZKHfa2eszkqiKlQT3/uY47oybWrWJlnvEslNtvrgYSp/f0gKhyaibHP5fjq5LSmEJFCzdDk7B5/c4GZJUPoYcKIrRbWysVJ45tRMbksL6FN71tt2s98= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=AJZEk1Xm; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="AJZEk1Xm" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1BBD0C4CEDD; Mon, 24 Mar 2025 12:08:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1742818093; bh=lQUzCkgG/COzR/pHntbS6EXiWn+eykoAt4Nb+HdA+EM=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=AJZEk1XmJh89GyNWx9VFhzXmUSRY4tcC16wczJgPmUNWl/TFU/ZWc1+EIVthB8QZj 4kgnQp+zCK9pkGeb8QSB0bVtxzZcdVAnXG83idm1y95ya4EzxgPpGlHJnmN1IHKZYm MNiAHVNVKwK7P+jFHMMS5C3nGPyOVWAoHOQnEdttwqB4tSHlOMmydblnc1ynIC/iWd dirP71mB2cSCGv5X982nfndBhM6tgfZZA+kYlUz9WFiMcS9K8wkh4SkenLvvBT+q+Q pI2FCLp4MFITPgJMbVWrgk/JEIFSEHXywaYDey3pROIDL82FN/4+P2+0V9/7+YMoXG /jKToq7rz5AsQ== From: Daniel Wagner Date: Mon, 24 Mar 2025 13:07:58 +0100 Subject: [PATCH RFC 3/3] nvme: delay failover by command quiesce timeout Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250324-tp4129-v1-3-95a747b4c33b@kernel.org> References: <20250324-tp4129-v1-0-95a747b4c33b@kernel.org> In-Reply-To: <20250324-tp4129-v1-0-95a747b4c33b@kernel.org> To: Christoph Hellwig , Sagi Grimberg , Keith Busch , Hannes Reinecke , John Meneghini , randyj@purestorage.com, Mohamed Khalfella Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Daniel Wagner X-Mailer: b4 0.14.2 TP4129 mandates that failover be delayed by CQT. 
Thus, when nvme_decide_disposition returns FAILOVER, do not immediately requeue the request at the namespace level; instead, queue it on the ctrl's failover_list and move it to the namespace's requeue_list later. Signed-off-by: Daniel Wagner --- drivers/nvme/host/core.c | 19 ++++++++++++++++ drivers/nvme/host/fc.c | 4 ++++ drivers/nvme/host/multipath.c | 52 +++++++++++++++++++++++++++++++++++++++= +--- drivers/nvme/host/nvme.h | 15 +++++++++++++ drivers/nvme/host/rdma.c | 2 ++ drivers/nvme/host/tcp.c | 1 + 6 files changed, 90 insertions(+), 3 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 135045528ea1c79eac0d6d47d5f7f05a7c98acc4..f3155c7735e75e06c4359c26db8= 931142c067e1d 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -239,6 +239,7 @@ static void nvme_do_delete_ctrl(struct nvme_ctrl *ctrl) =20 flush_work(&ctrl->reset_work); nvme_stop_ctrl(ctrl); + nvme_flush_failover(ctrl); nvme_remove_namespaces(ctrl); ctrl->ops->delete_ctrl(ctrl); nvme_uninit_ctrl(ctrl); @@ -1310,6 +1311,19 @@ static void nvme_queue_keep_alive_work(struct nvme_c= trl *ctrl) queue_delayed_work(nvme_wq, &ctrl->ka_work, delay); } =20 +void nvme_schedule_failover(struct nvme_ctrl *ctrl) +{ + unsigned long delay; + + if (ctrl->cqt) + delay =3D msecs_to_jiffies(ctrl->cqt); + else + delay =3D ctrl->kato * HZ; + + queue_delayed_work(nvme_wq, &ctrl->failover_work, delay); +} +EXPORT_SYMBOL_GPL(nvme_schedule_failover); + static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, blk_status_t status) { @@ -1336,6 +1350,8 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(stru= ct request *rq, dev_err(ctrl->device, "failed nvme_keep_alive_end_io error=3D%d\n", status); + + nvme_schedule_failover(ctrl); return RQ_END_IO_NONE; } =20 @@ -4716,6 +4732,7 @@ EXPORT_SYMBOL_GPL(nvme_remove_io_tag_set); =20 void nvme_stop_ctrl(struct nvme_ctrl *ctrl) { + nvme_schedule_failover(ctrl); nvme_mpath_stop(ctrl); nvme_auth_stop(ctrl); nvme_stop_failfast_work(ctrl); 
@@ -4842,6 +4859,8 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct dev= ice *dev, =20 INIT_DELAYED_WORK(&ctrl->ka_work, nvme_keep_alive_work); INIT_DELAYED_WORK(&ctrl->failfast_work, nvme_failfast_work); + INIT_DELAYED_WORK(&ctrl->failover_work, nvme_failover_work); + INIT_LIST_HEAD(&ctrl->failover_list); memset(&ctrl->ka_cmd, 0, sizeof(ctrl->ka_cmd)); ctrl->ka_cmd.common.opcode =3D nvme_admin_keep_alive; ctrl->ka_last_check_time =3D jiffies; diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index cdc1ba277a5c23ef1afd26e6911b082f3d12b215..bd897b29cd286008b781bbcb423= 0e08019da6b6b 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -2553,6 +2553,8 @@ nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, cha= r *errmsg) { enum nvme_ctrl_state state =3D nvme_ctrl_state(&ctrl->ctrl); =20 + nvme_schedule_failover(&ctrl->ctrl); + /* * if an error (io timeout, etc) while (re)connecting, the remote * port requested terminating of the association (disconnect_ls) @@ -3378,6 +3380,8 @@ nvme_fc_reset_ctrl_work(struct work_struct *work) /* will block will waiting for io to terminate */ nvme_fc_delete_association(ctrl); =20 + nvme_schedule_failover(&ctrl->ctrl); + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) dev_err(ctrl->ctrl.device, "NVME-FC{%d}: error_recovery: Couldn't change state " diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 2a7635565083046c575efe1793362ae10581defd..a14b055796b982df96609f53174= a5d1334c1c0c4 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -86,9 +86,11 @@ void nvme_mpath_start_freeze(struct nvme_subsystem *subs= ys) void nvme_failover_req(struct request *req) { struct nvme_ns *ns =3D req->q->queuedata; + struct nvme_ctrl *ctrl =3D nvme_req(req)->ctrl; u16 status =3D nvme_req(req)->status & NVME_SCT_SC_MASK; unsigned long flags; struct bio *bio; + enum nvme_ctrl_state state =3D nvme_ctrl_state(ctrl); =20 nvme_mpath_clear_current_path(ns); 
=20 @@ -121,9 +123,53 @@ void nvme_failover_req(struct request *req) blk_steal_bios(&ns->head->requeue_list, req); spin_unlock_irqrestore(&ns->head->requeue_lock, flags); =20 - nvme_req(req)->status =3D 0; - nvme_end_req(req); - kblockd_schedule_work(&ns->head->requeue_work); + spin_lock_irqsave(&ctrl->lock, flags); + list_add_tail(&req->queuelist, &ctrl->failover_list); + spin_unlock_irqrestore(&ctrl->lock, flags); + + if (state =3D=3D NVME_CTRL_DELETING) { + /* + * Requests that fail over in the DELETING state were + * canceled, and blk_mq_tagset_wait_completed_request will + * block until they have been processed. However, + * nvme_failover_work is already stopped, so schedule + * a failover; it's still necessary to delay these commands + * by CQT. + */ + nvme_schedule_failover(ctrl); + } +} + +void nvme_flush_failover(struct nvme_ctrl *ctrl) +{ + LIST_HEAD(failover_list); + struct request *rq; + bool kick =3D false; + + spin_lock_irq(&ctrl->lock); + list_splice_init(&ctrl->failover_list, &failover_list); + spin_unlock_irq(&ctrl->lock); + + while (!list_empty(&failover_list)) { + rq =3D list_entry(failover_list.next, + struct request, queuelist); + list_del_init(&rq->queuelist); + + nvme_req(rq)->status =3D 0; + nvme_end_req(rq); + kick =3D true; + } + + if (kick) + nvme_kick_requeue_lists(ctrl); +} + +void nvme_failover_work(struct work_struct *work) +{ + struct nvme_ctrl *ctrl =3D container_of(to_delayed_work(work), + struct nvme_ctrl, failover_work); + + nvme_flush_failover(ctrl); } =20 void nvme_mpath_start_request(struct request *rq) diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 7563332b5b7b76fc6165ec8c6f2d144737d4fe85..10eb323bdaf139526959180c1e6= 6ab4579bb145d 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -411,6 +411,9 @@ struct nvme_ctrl { =20 enum nvme_ctrl_type cntrltype; enum nvme_dctype dctype; + + struct delayed_work failover_work; + struct list_head failover_list; }; =20 static inline enum 
nvme_ctrl_state nvme_ctrl_state(struct nvme_ctrl *ctrl) @@ -954,6 +957,9 @@ void nvme_mpath_wait_freeze(struct nvme_subsystem *subs= ys); void nvme_mpath_start_freeze(struct nvme_subsystem *subsys); void nvme_mpath_default_iopolicy(struct nvme_subsystem *subsys); void nvme_failover_req(struct request *req); +void nvme_failover_work(struct work_struct *work); +void nvme_schedule_failover(struct nvme_ctrl *ctrl); +void nvme_flush_failover(struct nvme_ctrl *ctrl); void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl); int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl,struct nvme_ns_head *head= ); void nvme_mpath_add_disk(struct nvme_ns *ns, __le32 anagrpid); @@ -996,6 +1002,15 @@ static inline bool nvme_ctrl_use_ana(struct nvme_ctrl= *ctrl) static inline void nvme_failover_req(struct request *req) { } +static inline void nvme_failover_work(struct work_struct *work) +{ +} +static inline void nvme_schedule_failover(struct nvme_ctrl *ctrl) +{ +} +static inline void nvme_flush_failover(struct nvme_ctrl *ctrl) +{ +} static inline void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl) { } diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index 86a2891d9bcc7a990cd214a7fe93fa5c55b292c7..9bee376f881b4c3ebe5502abf23= a8e76829780ff 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -1127,6 +1127,7 @@ static void nvme_rdma_error_recovery_work(struct work= _struct *work) =20 nvme_stop_keep_alive(&ctrl->ctrl); flush_work(&ctrl->ctrl.async_event_work); + nvme_schedule_failover(&ctrl->ctrl); nvme_rdma_teardown_io_queues(ctrl, false); nvme_unquiesce_io_queues(&ctrl->ctrl); nvme_rdma_teardown_admin_queue(ctrl, false); @@ -2153,6 +2154,7 @@ static const struct blk_mq_ops nvme_rdma_admin_mq_ops= =3D { =20 static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shut= down) { + nvme_schedule_failover(&ctrl->ctrl); nvme_rdma_teardown_io_queues(ctrl, shutdown); nvme_quiesce_admin_queue(&ctrl->ctrl); nvme_disable_ctrl(&ctrl->ctrl, shutdown); diff 
--git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index d0023bcfd8a79a193adf2807a24481c8c164a174..3a6c1d3febaf233996e4dcf6847= 93327b5d1412f 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -2345,6 +2345,7 @@ static void nvme_tcp_error_recovery_work(struct work_= struct *work) =20 nvme_stop_keep_alive(ctrl); flush_work(&ctrl->async_event_work); + nvme_schedule_failover(ctrl); nvme_tcp_teardown_io_queues(ctrl, false); /* unquiesce to fail fast pending requests */ nvme_unquiesce_io_queues(ctrl); --=20 2.48.1
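
Illustrative note on the series (not part of the patches): the user-space sketch below restates two things the diffs rely on. First, splitting rsvd352[160] into rsvd352[34], a 16-bit cqt and rsvd388[124] keeps that region at 160 bytes. Second, nvme_schedule_failover() picks its delay from CQT (milliseconds) when it is non-zero and otherwise falls back to KATO (seconds, as the `ctrl->kato * HZ` expression implies). The names `struct id_ctrl_cqt_region` and `failover_delay_ms()` are hypothetical and exist only for this sketch; the kernel code works in jiffies, not milliseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space mirror of the bytes that patch 1 carves out of
 * the former rsvd352[160] area of struct nvme_id_ctrl. Offsets here are
 * relative to byte 352 of the Identify Controller data.
 */
struct id_ctrl_cqt_region {
	uint8_t  rsvd352[34];	/* bytes 352..385 */
	uint16_t cqt;		/* bytes 386..387, little-endian on the wire */
	uint8_t  rsvd388[124];	/* bytes 388..511 */
};

/* The split must not change the size or shift the fields that follow. */
static_assert(sizeof(struct id_ctrl_cqt_region) == 160,
	      "CQT split must preserve the 160-byte reserved region");
static_assert(offsetof(struct id_ctrl_cqt_region, cqt) == 386 - 352,
	      "cqt must sit at byte 386 of the identify data");
static_assert(offsetof(struct id_ctrl_cqt_region, rsvd388) == 388 - 352,
	      "rsvd388 must start at byte 388 of the identify data");

/*
 * Sketch of the delay selection in nvme_schedule_failover(), expressed in
 * milliseconds instead of jiffies: use CQT when the controller reports
 * one, otherwise fall back to the keep-alive timeout.
 */
static uint64_t failover_delay_ms(uint16_t cqt_ms, unsigned int kato_s)
{
	if (cqt_ms)
		return cqt_ms;

	return (uint64_t)kato_s * 1000;
}
```

With nvmet reporting NVMET_CQT (1), a host using this logic would hold failed-over requests for about one millisecond; a controller that reports a CQT of 0 would cause the host to wait for the full KATO instead.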