From: Mohamed Khalfella
To: Justin Tee, Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
	Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg
Cc: Aaron Dailey, Randy Jennings, Dhaval Giani, Hannes Reinecke,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
	Mohamed Khalfella
Subject: [PATCH v2 11/14] nvme-rdma: Use CCR to recover controller that hits an error
Date: Fri, 30 Jan 2026 14:34:15 -0800
Message-ID: <20260130223531.2478849-12-mkhalfella@purestorage.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260130223531.2478849-1-mkhalfella@purestorage.com>
References: <20260130223531.2478849-1-mkhalfella@purestorage.com>

An alive nvme controller that hits an error now moves to the FENCING
state instead of the RESETTING state. ctrl->fencing_work attempts CCR to
terminate inflight IOs. If CCR succeeds, switch to FENCED -> RESETTING
and continue error recovery as usual. If CCR fails, the behavior depends
on whether the subsystem supports CQT. If CQT is not supported, reset
the controller immediately as if CCR had succeeded, in order to maintain
the current behavior. If CQT is supported, switch to time-based recovery
and schedule ctrl->fenced_work, which resets the controller when
time-based recovery finishes. Either ctrl->err_work or ctrl->reset_work
can run after a controller is fenced.
Flush the fencing work when either work runs.

Signed-off-by: Mohamed Khalfella
---
 drivers/nvme/host/rdma.c | 62 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 61 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 35c0822edb2d..da45c9ea4f32 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -106,6 +106,8 @@ struct nvme_rdma_ctrl {

 	/* other member variables */
 	struct blk_mq_tag_set tag_set;
+	struct work_struct fencing_work;
+	struct delayed_work fenced_work;
 	struct work_struct err_work;

 	struct nvme_rdma_qe async_event_sqe;
@@ -1120,11 +1122,58 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
 	nvme_rdma_reconnect_or_remove(ctrl, ret);
 }

+static void nvme_rdma_fenced_work(struct work_struct *work)
+{
+	struct nvme_rdma_ctrl *rdma_ctrl = container_of(to_delayed_work(work),
+			struct nvme_rdma_ctrl, fenced_work);
+	struct nvme_ctrl *ctrl = &rdma_ctrl->ctrl;
+
+	nvme_change_ctrl_state(ctrl, NVME_CTRL_FENCED);
+	if (nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
+		queue_work(nvme_reset_wq, &rdma_ctrl->err_work);
+}
+
+static void nvme_rdma_fencing_work(struct work_struct *work)
+{
+	struct nvme_rdma_ctrl *rdma_ctrl = container_of(work,
+			struct nvme_rdma_ctrl, fencing_work);
+	struct nvme_ctrl *ctrl = &rdma_ctrl->ctrl;
+	unsigned long rem;
+
+	rem = nvme_fence_ctrl(ctrl);
+	if (!rem)
+		goto done;
+
+	if (!ctrl->cqt) {
+		dev_info(ctrl->device,
+			 "CCR failed, CQT not supported, skip time-based recovery\n");
+		goto done;
+	}
+
+	dev_info(ctrl->device,
+		 "CCR failed, switch to time-based recovery, timeout = %ums\n",
+		 jiffies_to_msecs(rem));
+	queue_delayed_work(nvme_wq, &rdma_ctrl->fenced_work, rem);
+	return;
+
+done:
+	nvme_change_ctrl_state(ctrl, NVME_CTRL_FENCED);
+	if (nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
+		queue_work(nvme_reset_wq, &rdma_ctrl->err_work);
+}
+
+static void nvme_rdma_flush_fencing_work(struct nvme_rdma_ctrl *ctrl)
+{
+	flush_work(&ctrl->fencing_work);
+	flush_delayed_work(&ctrl->fenced_work);
+}
+
 static void nvme_rdma_error_recovery_work(struct work_struct *work)
 {
 	struct nvme_rdma_ctrl *ctrl = container_of(work,
 			struct nvme_rdma_ctrl, err_work);

+	nvme_rdma_flush_fencing_work(ctrl);
 	nvme_stop_keep_alive(&ctrl->ctrl);
 	flush_work(&ctrl->ctrl.async_event_work);
 	nvme_rdma_teardown_io_queues(ctrl, false);
@@ -1147,6 +1196,12 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)

 static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
 {
+	if (nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_FENCING)) {
+		dev_warn(ctrl->ctrl.device, "starting controller fencing\n");
+		queue_work(nvme_wq, &ctrl->fencing_work);
+		return;
+	}
+
 	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING))
 		return;

@@ -1957,13 +2012,15 @@ static enum blk_eh_timer_return nvme_rdma_timeout(struct request *rq)
 	struct nvme_rdma_ctrl *ctrl = queue->ctrl;
 	struct nvme_command *cmd = req->req.cmd;
 	int qid = nvme_rdma_queue_idx(queue);
+	enum nvme_ctrl_state state;

 	dev_warn(ctrl->ctrl.device,
 		 "I/O tag %d (%04x) opcode %#x (%s) QID %d timeout\n",
 		 rq->tag, nvme_cid(rq), cmd->common.opcode,
 		 nvme_fabrics_opcode_str(qid, cmd), qid);

-	if (nvme_ctrl_state(&ctrl->ctrl) != NVME_CTRL_LIVE) {
+	state = nvme_ctrl_state(&ctrl->ctrl);
+	if (state != NVME_CTRL_LIVE && state != NVME_CTRL_FENCING) {
 		/*
 		 * If we are resetting, connecting or deleting we should
 		 * complete immediately because we may block controller
@@ -2169,6 +2226,7 @@ static void nvme_rdma_reset_ctrl_work(struct work_struct *work)
 		container_of(work, struct nvme_rdma_ctrl, ctrl.reset_work);
 	int ret;

+	nvme_rdma_flush_fencing_work(ctrl);
 	nvme_stop_ctrl(&ctrl->ctrl);
 	nvme_rdma_shutdown_ctrl(ctrl, false);

@@ -2281,6 +2339,8 @@ static struct nvme_rdma_ctrl *nvme_rdma_alloc_ctrl(struct device *dev,

 	INIT_DELAYED_WORK(&ctrl->reconnect_work,
 			nvme_rdma_reconnect_ctrl_work);
+	INIT_DELAYED_WORK(&ctrl->fenced_work, nvme_rdma_fenced_work);
+	INIT_WORK(&ctrl->fencing_work, nvme_rdma_fencing_work);
 	INIT_WORK(&ctrl->err_work, nvme_rdma_error_recovery_work);
 	INIT_WORK(&ctrl->ctrl.reset_work, nvme_rdma_reset_ctrl_work);

-- 
2.52.0