From nobody Mon Dec 1 23:34:59 2025 Received: from mail-pl1-f182.google.com (mail-pl1-f182.google.com [209.85.214.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 870AB308F0A for ; Wed, 26 Nov 2025 02:13:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123205; cv=none; b=aDww0/gDshyRMWTEKnuxrkRwFyHvCtuttA/ZW7QqumcSqLtMBfJHE6JOOwC9OKIJsy/G08z/V+1m7XEdu8HtbEJXHvE67kR/WLAnMpDO2pTT1E1DLhQxVixEUEwRWLQl2QT0k8uCmYL2IlMbQbEfA/R+MW8S3YlDmOqrU+1WsK4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123205; c=relaxed/simple; bh=lmifS86Sm5eDcpfc1j4bH8U5iF7UgFk8snitIQIqc+U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RQiGZkv4q22FEDgLSPmI3rv5pJU1EU66CLh1uvO6EUm1yggwTVpOg1wr7rsXMduq6HTMCVzlbk9w4njRk7FaEf7lmUJrg+CjpUyOQAqmBlM0se3pUYDYNbW/ekGZifxqP8D28DYmHhtDIMwuwHb8rhaHkF4UdcssxwGFgCdHLJc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=Ejlr/eeK; arc=none smtp.client-ip=209.85.214.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="Ejlr/eeK" Received: by mail-pl1-f182.google.com with SMTP id d9443c01a7336-298287a26c3so71846435ad.0 for ; Tue, 25 Nov 2025 18:13:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123203; x=1764728003; 
darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+zJ+1UlEvFnxYO/+mlL/kPamED7cAsv3lc5btMMeFCc=; b=Ejlr/eeKQmuRROTiNRgJMvTLPvgspmW3FU1qb1Qt+3HjyUAum8hcg57NSDU7zvd4fv hn/abo5oDrEKL+QTpKGQpFA8MfYtOfFvaDN1ELRdDHGW1gmu0ddde5RmM6QFK+yV18ra ZbO63jGdqZ+LU0opSMf0ZqYrU50wolrL2zLykLjrp+V70YnhowEVBqHCR+GxT4uxEkvt qv1KiWBBdtJrRzutEFg5VgC9g2te8l7aeaf+6SpOzxwcajtNkHCp2W0q5OS7FoYJhbqO yG7CJXR8DyQmbj6ga3B54UqFbQW+Ueu+2Zn4H9QhfQTcwqaR7zdsRGhQa6bA4/QJveab 5ZSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123203; x=1764728003; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=+zJ+1UlEvFnxYO/+mlL/kPamED7cAsv3lc5btMMeFCc=; b=LYQqlTr9nvh9aQTu97ct2FWZztSKBhIMsuzKHOFsWeNigYWB9Xf4UAPM540Um4NvGt O2t8NxJUxxuU10BLadxiG6Fvf7Omcv7sBjlpwlYahXtSvzRl6HeHQ7qst4l73A9Xor9P fzHvG7VEpV9hm4Vsh9JeEUji3wvg3KWNSq4hDln4ma/WPdrCqGh6un5eqJVgztahr6DQ BPEywYqWGTpDfmnx1en7f0kgq6XZW3au/Q02gjBiVnh0FvOhjy5NxtZQrqnEss6U6m+O xssw+Z5uDdXByxF+ETV2HR+YDy3tXrZD4EnO7op0gfS5dajr7LxxyaH8o1mHnE25z8Xk IK4Q== X-Forwarded-Encrypted: i=1; AJvYcCWpV8SVG7RgjEL3U0izIHAKHwYkasokV5luxWQhGlzi39mfzctcDqFhTIHVGX1Pxkbl6nZJu7/uWKU3BjQ=@vger.kernel.org X-Gm-Message-State: AOJu0Yz8HB7TQWdTt7irBKtAjkyMlXp3E3g644FlKzQJ1d40VpYAirCL uo/g0T8D/t966SX6TC7TVPzzenMXTpUn2yI6lC4T5aq5ZFc6riNVzH2lEmIKlh1SY40= X-Gm-Gg: ASbGncsCeB/GO332q7dHprkmV23rFNtRxblzXNYCXSoj4GfUyaqOoXCu1YCzZWTmoaE FOvqitNVkRNXT2AOlxLzLme+CyR+8y1G6W9+aa2HB2DqaJdmHDh9KhRZvMpHmtw4MoGo3SFGQM2 OVBkPp2qlsmR13WTvF3ae/+v/w7oohDxV7n+xiL+3HMO7zs2wMhm9CNi7l3dQ5IzKn2SJe/4jQy JsPav+q4HGB/SnlTcqMEYmH1wOFNTzoSfiX/vCyoOXBLqk9nHTNgjSFd3q14zg2bZfjH2p11AgL uy+T4gdU+41AkI4huFg5D68rWIIqYwMXHB1ZdsnAUbnPxyq0OYBMhQfXpy3/WMgN6zdCo/GRhfN aX+LSOkb0SybUkHvqOWl+BQmRuMaUS3tEw7IsY0xVG+7euYFDjcD9nIg6QOcI6FJWiGL5EKC1TR 
eaQnIw3K1lNToppncFr2Flfwgln+mOyMtZaAXAdhtQqFKy X-Google-Smtp-Source: AGHT+IH3R2YXscEDzBRGNgHJXNj5sRMpFq3GXf09IFZdkJCTqRDyWOlwHS19A1MouExGkMqjcKV+Kg== X-Received: by 2002:a05:7022:90f:b0:11b:9b98:aa4b with SMTP id a92af1059eb24-11c9d60ea32mr11119676c88.6.1764123202326; Tue, 25 Nov 2025 18:13:22 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:22 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 01/14] nvmet: Rapid Path Failure Recovery set controller identify fields Date: Tue, 25 Nov 2025 18:11:48 -0800 Message-ID: <20251126021250.2583630-2-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" TP8028 Rapid Path Failure Recovery defines new fields in the controller identify response. The newly defined fields are: - CIU (Controller Instance Uniquifier): an 8-bit non-zero value that is assigned a random value when the controller is first created. The value is expected to be incremented when the RDY bit in the CSTS register is asserted. - CIRN (Controller Instance Random Number): a 64-bit random value that is generated when the controller is created. CIRN is regenerated every time the RDY bit in the CSTS register is asserted. 
- CCRL (Cross-Controller Reset Limit): an 8-bit value that defines the maximum number of in-progress controller reset operations. CCRL is hardcoded to 4 as recommended by TP8028. TP4129 KATO Corrections and Clarifications defines CQT (Command Quiesce Time), which is used along with KATO (Keep Alive Timeout) to set an upper time limit for attempting Cross-Controller Recovery. Make the new fields available for I/O controllers only, since TP8028 is not very useful for discovery controllers. Signed-off-by: Mohamed Khalfella --- drivers/nvme/target/admin-cmd.c | 6 ++++++ drivers/nvme/target/core.c | 9 +++++++++ drivers/nvme/target/nvmet.h | 2 ++ include/linux/nvme.h | 15 ++++++++++++--- 4 files changed, 29 insertions(+), 3 deletions(-) diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cm= d.c index 3e378153a781..aaceb697e4d2 100644 --- a/drivers/nvme/target/admin-cmd.c +++ b/drivers/nvme/target/admin-cmd.c @@ -696,6 +696,12 @@ static void nvmet_execute_identify_ctrl(struct nvmet_r= eq *req) =20 id->cntlid =3D cpu_to_le16(ctrl->cntlid); id->ver =3D cpu_to_le32(ctrl->subsys->ver); + if (!nvmet_is_disc_subsys(ctrl->subsys)) { + id->cqt =3D NVMF_CQT_MS; + id->ciu =3D ctrl->uniquifier; + id->cirn =3D cpu_to_le64(ctrl->random); + id->ccrl =3D NVMF_CCR_LIMIT; + } =20 /* XXX: figure out what to do about RTD3R/RTD3 */ id->oaes =3D cpu_to_le32(NVMET_AEN_CFG_OPTIONAL); diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index 5d7d483bfbe3..409928202503 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -1393,6 +1393,10 @@ static void nvmet_start_ctrl(struct nvmet_ctrl *ctrl) return; } =20 + if (!nvmet_is_disc_subsys(ctrl->subsys)) { + ctrl->uniquifier =3D ((u8)(ctrl->uniquifier + 1)) ? 
: 1; + ctrl->random =3D get_random_u64(); + } ctrl->csts =3D NVME_CSTS_RDY; =20 /* @@ -1662,6 +1666,11 @@ struct nvmet_ctrl *nvmet_alloc_ctrl(struct nvmet_all= oc_ctrl_args *args) } ctrl->cntlid =3D ret; =20 + if (!nvmet_is_disc_subsys(ctrl->subsys)) { + ctrl->uniquifier =3D get_random_u8() ? : 1; + ctrl->random =3D get_random_u64(); + } + /* * Discovery controllers may use some arbitrary high value * in order to cleanup stale discovery sessions diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index 51df72f5e89b..4195c9eff1da 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -263,7 +263,9 @@ struct nvmet_ctrl { =20 uuid_t hostid; u16 cntlid; + u8 uniquifier; u32 kato; + u64 random; =20 struct nvmet_port *port; =20 diff --git a/include/linux/nvme.h b/include/linux/nvme.h index 655d194f8e72..5135cdc3c120 100644 --- a/include/linux/nvme.h +++ b/include/linux/nvme.h @@ -21,6 +21,9 @@ #define NVMF_TRADDR_SIZE 256 #define NVMF_TSAS_SIZE 256 =20 +#define NVMF_CQT_MS 0 +#define NVMF_CCR_LIMIT 4 + #define NVME_DISC_SUBSYS_NAME "nqn.2014-08.org.nvmexpress.discovery" =20 #define NVME_NSID_ALL 0xffffffff @@ -328,7 +331,10 @@ struct nvme_id_ctrl { __le16 crdt1; __le16 crdt2; __le16 crdt3; - __u8 rsvd134[122]; + __u8 rsvd134[1]; + __u8 ciu; + __le64 cirn; + __u8 rsvd144[112]; __le16 oacs; __u8 acl; __u8 aerl; @@ -362,7 +368,9 @@ struct nvme_id_ctrl { __u8 anacap; __le32 anagrpmax; __le32 nanagrpid; - __u8 rsvd352[160]; + __u8 rsvd352[34]; + __le16 cqt; + __u8 rsvd388[124]; __u8 sqes; __u8 cqes; __le16 maxcmd; @@ -389,7 +397,8 @@ struct nvme_id_ctrl { __u8 msdbd; __u8 rsvd1804[2]; __u8 dctype; - __u8 rsvd1807[241]; + __u8 ccrl; + __u8 rsvd1808[240]; struct nvme_id_power_state psd[32]; __u8 vs[1024]; }; --=20 2.51.2 From nobody Mon Dec 1 23:34:59 2025 Received: from mail-pf1-f178.google.com (mail-pf1-f178.google.com [209.85.210.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate 
requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A7373090F1 for ; Wed, 26 Nov 2025 02:13:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123205; cv=none; b=fZeVN9ZtaqNufL1eS7dAxPPyT+8/bSGEnuAs67Pr6W2bdxdC2f41U0/apKXeg8Yh+6rOB3VpiL2sfzfKKnlibwZ5xDtgvcElwvG0AHcFBcULD9tv/YTwJ8eL1EmD3yhkgUJc2+UOzwRKclMeSppozjKqrpc/0xiHIdJaW5hBRk0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123205; c=relaxed/simple; bh=uOxDrgogARaROreGw/O7QNtE20CnLHm+LZb/0cVAvdo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=u1PIKNIcn7lPbmlyd3qtAz7K8nFx2JH7JLmV5K/duKDcI7++pamCVMCaM0LP+p44lc4MBfd14+RssNDlgtikMcTf9MgNNlkeUzFgoiRky6zp3wHHTivS2JwifK8DNz3ig/FljTwOdyMirkl7I4XkAZuLqyy4b7hf+3ipb5gXeCM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=QiWRzKaz; arc=none smtp.client-ip=209.85.210.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="QiWRzKaz" Received: by mail-pf1-f178.google.com with SMTP id d2e1a72fcca58-7bb3092e4d7so6727457b3a.0 for ; Tue, 25 Nov 2025 18:13:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123204; x=1764728004; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=hb0T1pFUt+Fg9H0rtoLbOEnAOIT/fDYuF7o1ImhrgBk=; 
b=QiWRzKazbYykflQSBi0FIXTYFxRoVQO/EtTkUWpN/lxuJ4nC0mnNFnmCFR+mVxD8Gj CY4MexrMfuAlQ79IJMC75NDlwIu5fWlpX0EBtMoxRXkLv5tRGt2tUpH0NvQaWX1QNGIb Vb2IjoIVO6vqcXrui5IiTxTrawzjx9ZDARFdq2jg2P9PrSCyvgf1AVwC5brqbQePUNXl oTbL3GNf6RK9UgxuAHVgw7TQzKToPs4PbXA3enBBXzRxwwaZkSlRj0cLpZMKbqlkPhPn WKjEwqTqUaz8HoAf1e3kFGh0mHOiyzCOPrhZCIb4bbDQ0I7wPONAxU9zzJQBe3KPhvGs 8pSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123204; x=1764728004; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=hb0T1pFUt+Fg9H0rtoLbOEnAOIT/fDYuF7o1ImhrgBk=; b=Gk4OsZJ6d8+GLY4MglIbwXH3QKMUHlY+9xyiJHx11NgTyHrWOPalJyHfI6vBZPf1mF WpJiWD5dCiQD9jndECv9w5VkJdUL0p6huAPHdt6DE2HddPBojRNPGUPyqt3SFUEpqAGp T958SdPJQFUQ6LDu35vkLv+of5qgbVgmPWY3VZwCkSqECO2s4pksYphFPS+z2ZgE8Muw 5mbub9uNBkikBTqN0zan7NaNFb9c5uaVMBh6SFyZOGp2j1YJCcW7prW1HdH4K2N9Mqs4 ARBWjghCnNAWSmVrcIga3WQm/mHH3neKIRc4O23C+ub/5PCYl/jcfmKE5/JIy+vS7RBd kuqw== X-Forwarded-Encrypted: i=1; AJvYcCXZSHsTYlPP1smzeOOZ/MmuMbWD1hhZWbRuAsJd6AAHPhjjh0vXmcYVaXjMs1JNgZykvL24heo5Nsosr4A=@vger.kernel.org X-Gm-Message-State: AOJu0YwKwXQVbOQKXGBOXLVvoCCD+n84qg6gQEe2XcZFxfzo0BmeV6Od v744Np/NzFmB7sLCD98/Rezjjpt2HkOR31JNBcrid9oyiGVp5PixkaJTkv8CDlaOoT+s06UEVAV EF8Hd X-Gm-Gg: ASbGnctZ0KeusKFGhyLhsItklAHrp6yHbTt/me4psgMWfzL5w4iYsXZmxPIKLO5+Irk 35M/C/ugRmaruXWzhi9dQWp617pxtY20KL8tKl206mktmBYK89xvgu9G4IVDFK9Z49E3mnik0o3 Jh3XXzwkX4hyYtZEgfh4xzKJwIHGaz13R3sH5Oas4rtidpIURhdmS2cmG4CuyM7vAmkA+Bj3yyr j0Lp9ZoDtPtpbpRtQNAa9JJJHPQSGoaUtmdHagsAEKZgO5kwyHL9m2p/KuLmIh4oO+wdBtZWwMm oJMh6r2BKkw575ro/NCOjcESCUmd6PTUScc8c8r66LfuOfG89VmF6SZR9qNY2LDkv9+Tnj5Mf44 q/2qzIya09HWnSyw8lXXFs5nrix2VZz2mb9CeAgkOMtiYKYpA7HdEfm4SaKYL6kMg//55wEYKCE NuSkaDjUw5tJE2l7frrb9s6ohdDxRIa+skHA== X-Google-Smtp-Source: AGHT+IGQTznDK4qX5kyXrrEYXRM0lRZjmQu4vbxK2XPJ2chF1SsUYfXs5xX8kyhOVg5GdCdI+Tb2vQ== X-Received: by 2002:a05:7022:3898:b0:119:e569:fb9d with SMTP id 
a92af1059eb24-11c9d7178ccmr12382169c88.12.1764123203252; Tue, 25 Nov 2025 18:13:23 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:22 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 02/14] nvmet/debugfs: Add ctrl uniquifier and random values Date: Tue, 25 Nov 2025 18:11:49 -0800 Message-ID: <20251126021250.2583630-3-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Export ctrl->random and ctrl->uniquifier as debugfs files under controller debugfs directory. 
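To make the two exported values concrete, here is a minimal userspace sketch of how they evolve and how they would read back as text. It assumes the update rule used elsewhere in this series (an 8-bit increment that skips zero) and the "%02x" and "%016llx" output formats of the debugfs show handlers. The helper names (initial_uniquifier, next_uniquifier, format_uniquifier, format_random) are hypothetical and exist only for this illustration; this is not kernel code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: pick the first CIU from a random byte, never zero. */
static uint8_t initial_uniquifier(uint8_t rnd)
{
	return rnd ? rnd : 1;
}

/* Hypothetical helper: 8-bit increment that skips zero (0xff wraps to 0x01). */
static uint8_t next_uniquifier(uint8_t cur)
{
	uint8_t next = (uint8_t)(cur + 1);

	return next ? next : 1;
}

/* Render the CIU as two hex digits plus a newline, mirroring "%02x\n". */
static int format_uniquifier(char *buf, size_t len, uint8_t ciu)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%02x\n", (unsigned int)ciu);
}

/* Render the CIRN as sixteen hex digits plus a newline, mirroring "%016llx\n". */
static int format_random(char *buf, size_t len, uint64_t cirn)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%016" PRIx64 "\n", cirn);
}
```

The zero-skipping increment is what keeps the CIU non-zero across every RDY transition, so a reader of the uniquifier file should never see "00" for an I/O controller.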
Signed-off-by: Mohamed Khalfella --- drivers/nvme/target/debugfs.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/drivers/nvme/target/debugfs.c b/drivers/nvme/target/debugfs.c index 5dcbd5aa86e1..c983b1776ab8 100644 --- a/drivers/nvme/target/debugfs.c +++ b/drivers/nvme/target/debugfs.c @@ -152,6 +152,23 @@ static int nvmet_ctrl_tls_concat_show(struct seq_file = *m, void *p) } NVMET_DEBUGFS_ATTR(nvmet_ctrl_tls_concat); #endif +static int nvmet_ctrl_instance_uniquifier_show(struct seq_file *m, void *p) +{ + struct nvmet_ctrl *ctrl =3D m->private; + + seq_printf(m, "%02x\n", ctrl->uniquifier); + return 0; +} +NVMET_DEBUGFS_ATTR(nvmet_ctrl_instance_uniquifier); + +static int nvmet_ctrl_instance_random_show(struct seq_file *m, void *p) +{ + struct nvmet_ctrl *ctrl =3D m->private; + + seq_printf(m, "%016llx\n", ctrl->random); + return 0; +} +NVMET_DEBUGFS_ATTR(nvmet_ctrl_instance_random); =20 int nvmet_debugfs_ctrl_setup(struct nvmet_ctrl *ctrl) { @@ -184,6 +201,10 @@ int nvmet_debugfs_ctrl_setup(struct nvmet_ctrl *ctrl) debugfs_create_file("tls_key", S_IRUSR, ctrl->debugfs_dir, ctrl, &nvmet_ctrl_tls_key_fops); #endif + debugfs_create_file("uniquifier", S_IRUSR, ctrl->debugfs_dir, ctrl, + &nvmet_ctrl_instance_uniquifier_fops); + debugfs_create_file("random", S_IRUSR, ctrl->debugfs_dir, ctrl, + &nvmet_ctrl_instance_random_fops); return 0; } =20 --=20 2.51.2 From nobody Mon Dec 1 23:34:59 2025 Received: from mail-pl1-f175.google.com (mail-pl1-f175.google.com [209.85.214.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 377833093DE for ; Wed, 26 Nov 2025 02:13:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123206; cv=none; 
b=C/NKmkPpOmr+dxiyhZkdov/X1DkjrWtWfCLZxUXy2Pt+RGwRDKFy15RcF6LATHTnxf/79q1gcktxCuRtuZwkFdOONz39cFZMgDga+HqY0i09lwfo6Xx1woJ1Do+3WKXA8WOyp1pigDRuk4WBcFrN9GuY/LDYeCBpsccu/uPF5qg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123206; c=relaxed/simple; bh=m7HncE7hjhDGiYwQsmDETG5E1qzwbIvyXgpxRikxB9I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Dvr3fsvdgCZz0hoQM8HuqD0+aQgRPLUVuP3rA1s/OdfQL01PuOAkxuFNwBDCaxh5105ja+PFGQvNo+sGpZHU329Iyfn58xmQjXW4PyheFp9xoZ1GzUAcldHTskdxe57LISAdX92I8nC68HQ7s/+GfpqTfS425fDyckYsH7St9Zg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=IclFKiIb; arc=none smtp.client-ip=209.85.214.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="IclFKiIb" Received: by mail-pl1-f175.google.com with SMTP id d9443c01a7336-298039e00c2so87146455ad.3 for ; Tue, 25 Nov 2025 18:13:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123204; x=1764728004; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/6Ktrx/UQXxQ+qDpf8ETJ4IOmhLU/ttIqJExJUB/Co0=; b=IclFKiIbV1n4bSAgFqnxxIkhToyHvq1F5ThAotsUsiS/CNTuPhVKEm64wOiY4WkvQ1 jikbbjf44Sp2pMnHU5DaYbaBaawo+LJ3oplyhW3IxesViKkrzJzSuufHWAGNsdqhFeDU e7eetfwBv+2cSIdPTBnSfyYMcXI9GLyQVRwNtPwHaoQx6m3A4ly6MkGH52Lj+k4pXQ4E mtDqzv5wGs9Uy5RpqvqQa1vV9Fxt12lIapZc2ogj113QzxeT9y6F7wc5uKGG9ZKEzPmV 
qcOWFJwyfUtWHgcDdF4zXBuQFv7DkiPlZCI0EQqaNTbR3kdI+urenyGvrlHNW7uQS8la atVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123204; x=1764728004; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=/6Ktrx/UQXxQ+qDpf8ETJ4IOmhLU/ttIqJExJUB/Co0=; b=kdEfCQ0QvlzMZbBnQypw5VFsxfvbdQQYa7zw2i/6zBbkrkw1BV+900+eKAT605kbs/ Y/5rm+6NfcI9r1AEKd5X6RbVFDO4eRedPZdou9CKITkAxAh7lEsqufymDVZ77KMN3pPY VTlPaO+4UlXtOlHsQtgE+inIwyqQrBOEqj3nWjZlo3UrMdiLOBdQIFE6JnkcBWQlSZ5H gdhSIdYwpUsUA/fvQXw1PmdHSuNtNnSkvF/c6OHKNVQH11u4dlU5KmJh3VZXf7JT0d0G yELL7mdippAO0OwNkuyJGcRi3nX8roEWBiKEz0WjgLTUwQ3n9hvc/dEy8dXAWFJKpGXo juUQ== X-Forwarded-Encrypted: i=1; AJvYcCU6YM6ZTQWxjgyN+DpOkgtAT5Cabjm7yRJWQPMa0MEZ/5pN8K7IrFN2HEKI45Al9eibUYYJ4XXCp4ZCVGU=@vger.kernel.org X-Gm-Message-State: AOJu0YzjCWwXMrzVnthHxGqwlulr/b7eyFubnPWBTG65wfLBf2BuOfQp hI5ydmgySOEMs6WQP4pQib1CHmAj411QBbDvu8c5Bw58CtZiPZKFTloxZj8YOJDVlC4= X-Gm-Gg: ASbGncuccwR0IsPefcQ5c4v6Pv2ZatTtoUotVMuJjYOjEmG/exKRZKZohkbkdLBzV/0 wHFDyAXHTNQ5lsohnp1iJmnu/bYnqvfvTi/lzwfxjCOPbZTPcCSPTpL+ceU8K9FDCxnlYchfGI6 DYkuqI2OQNdrGnY8y43si+Owvj9joamuB6RhxH7i6sirHcDkxhyzavUOGBFDKQqrwI7hlmhKACC Qe9lpcD19NJd3yLpL1+C/TBMN06F1Zcbulm6i+rTTv7qrTDkMWu0f/QODNztLtGENNEYdLrCk86 08Wy/1I40YOZd8o4B3T8iV13bm5Flik+OWvuC5zDS7khFvM3i7cSuhfP2lSUv5U1K6We+u8Wi4f V4+sQSJcl7S425mSJ0st1Q1W4g5UxDf9Vkx9LzodNAnxLiHQhBQ5SouMJagr+lf0ca3FSW0DfOM M/hIHH2W3BXRyHosyujbzXZkW6FVu+fj6UhA== X-Google-Smtp-Source: AGHT+IGhlGJPpQ3+vlhHqQlj5aqUn4UV+qpX9NaUrwXRBSwz4d1DrNZ/KHjJ/B/28fdUcg/Ys0VOZQ== X-Received: by 2002:a05:7022:4297:b0:11b:9386:a381 with SMTP id a92af1059eb24-11c9d871990mr13646494c88.48.1764123204091; Tue, 25 Nov 2025 18:13:24 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.23 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:23 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 03/14] nvmet: Implement CCR nvme command Date: Tue, 25 Nov 2025 18:11:50 -0800 Message-ID: <20251126021250.2583630-4-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Defined by TP8028 Rapid Path Failure Recovery, the CCR (Cross-Controller Reset) command is an NVMe command that the initiator issues to the source controller to reset the impacted controller. Implement the CCR command for the Linux NVMe target. 
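As a rough illustration of how the source controller might decide whether to accept a new CCR request, the following userspace sketch models the order of checks this patch performs on the per-controller CCR list: reject a duplicate in-progress reset first, then enforce the in-progress limit, then the log page capacity. The limits mirror NVMF_CCR_LIMIT (4) and NVMF_CCR_PER_PAGE (511) from this series. The struct, the enum, and ccr_admit() are hypothetical names for this model only; the verdicts are not the NVMe status codes, and the model leaves out the earlier steps (rejecting a request that targets the issuing controller itself, and reporting immediate success when no matching controller is found).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CCR_LIMIT	4	/* mirrors NVMF_CCR_LIMIT in this series */
#define CCR_PER_PAGE	511	/* mirrors NVMF_CCR_PER_PAGE in this series */

/* Hypothetical verdicts for the model; not the NVMe status values. */
enum ccr_verdict {
	CCR_ACCEPT,
	CCR_IN_PROGRESS,
	CCR_LIMIT_EXCEEDED,
	CCR_LOGPAGE_FULL,
};

/* One tracked request; active is false once the impacted controller is gone. */
struct ccr_entry {
	uint16_t icid;
	bool active;
};

/* Decide whether a reset of controller 'icid' can be queued. */
static enum ccr_verdict ccr_admit(const struct ccr_entry *entries, size_t n,
				  uint16_t icid)
{
	size_t total = 0, active = 0;

	assert(entries != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* A reset of the same controller is already in flight. */
		if (entries[i].active && entries[i].icid == icid)
			return CCR_IN_PROGRESS;

		total++;
		if (entries[i].active)
			active++;
	}

	if (active >= CCR_LIMIT)
		return CCR_LIMIT_EXCEEDED;
	if (total >= CCR_PER_PAGE)
		return CCR_LOGPAGE_FULL;
	return CCR_ACCEPT;
}
```

Completed entries still count toward the page capacity but not toward the in-progress limit, which is why the model keeps two counters.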
Signed-off-by: Mohamed Khalfella --- drivers/nvme/target/admin-cmd.c | 79 +++++++++++++++++++++++++++++++++ drivers/nvme/target/core.c | 69 ++++++++++++++++++++++++++++ drivers/nvme/target/nvmet.h | 13 ++++++ include/linux/nvme.h | 23 ++++++++++ 4 files changed, 184 insertions(+) diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cm= d.c index aaceb697e4d2..a55ca010d34f 100644 --- a/drivers/nvme/target/admin-cmd.c +++ b/drivers/nvme/target/admin-cmd.c @@ -376,7 +376,9 @@ static void nvmet_get_cmd_effects_admin(struct nvmet_ct= rl *ctrl, log->acs[nvme_admin_get_features] =3D log->acs[nvme_admin_async_event] =3D log->acs[nvme_admin_keep_alive] =3D + log->acs[nvme_admin_cross_ctrl_reset] =3D cpu_to_le32(NVME_CMD_EFFECTS_CSUPP); + } =20 static void nvmet_get_cmd_effects_nvm(struct nvme_effects_log *log) @@ -1615,6 +1617,80 @@ void nvmet_execute_keep_alive(struct nvmet_req *req) nvmet_req_complete(req, status); } =20 +void nvmet_execute_cross_ctrl_reset(struct nvmet_req *req) +{ + struct nvmet_ctrl *ictrl, *ctrl =3D req->sq->ctrl; + struct nvme_command *cmd =3D req->cmd; + struct nvmet_ccr *ccr, *new_ccr; + int ccr_active, ccr_total; + u16 cntlid, status =3D 0; + + cntlid =3D le16_to_cpu(cmd->ccr.icid); + if (ctrl->cntlid =3D=3D cntlid) { + req->error_loc =3D + offsetof(struct nvme_cross_ctrl_reset_cmd, icid); + status =3D NVME_SC_INVALID_FIELD | NVME_STATUS_DNR; + goto out; + } + + ictrl =3D nvmet_ctrl_find_get_ccr(ctrl->subsys, ctrl->hostnqn, + cmd->ccr.ciu, cntlid, + le64_to_cpu(cmd->ccr.cirn)); + if (!ictrl) { + /* Immediate Reset Successful */ + nvmet_set_result(req, 1); + status =3D NVME_SC_SUCCESS; + goto out; + } + + new_ccr =3D kmalloc(sizeof(*ccr), GFP_KERNEL); + if (!new_ccr) { + status =3D NVME_SC_INTERNAL; + goto out_put_ctrl; + } + + ccr_total =3D ccr_active =3D 0; + mutex_lock(&ctrl->lock); + list_for_each_entry(ccr, &ctrl->ccrs, entry) { + if (ccr->ctrl =3D=3D ictrl) { + status =3D NVME_SC_CCR_IN_PROGRESS | NVME_STATUS_DNR; + goto 
out_unlock; + } + + ccr_total++; + if (ccr->ctrl) + ccr_active++; + } + + if (ccr_active >=3D NVMF_CCR_LIMIT) { + status =3D NVME_SC_CCR_LIMIT_EXCEEDED; + goto out_unlock; + } + if (ccr_total >=3D NVMF_CCR_PER_PAGE) { + status =3D NVME_SC_CCR_LOGPAGE_FULL; + goto out_unlock; + } + + new_ccr->ciu =3D cmd->ccr.ciu; + new_ccr->icid =3D cntlid; + new_ccr->ctrl =3D ictrl; + list_add_tail(&new_ccr->entry, &ctrl->ccrs); + mutex_unlock(&ctrl->lock); + + nvmet_ctrl_fatal_error(ictrl); + nvmet_ctrl_put(ictrl); + nvmet_req_complete(req, 0); + return; + +out_unlock: + mutex_unlock(&ctrl->lock); + kfree(new_ccr); +out_put_ctrl: + nvmet_ctrl_put(ictrl); +out: + nvmet_req_complete(req, status); +} + u32 nvmet_admin_cmd_data_len(struct nvmet_req *req) { struct nvme_command *cmd =3D req->cmd; @@ -1692,6 +1768,9 @@ u16 nvmet_parse_admin_cmd(struct nvmet_req *req) case nvme_admin_keep_alive: req->execute =3D nvmet_execute_keep_alive; return 0; + case nvme_admin_cross_ctrl_reset: + req->execute =3D nvmet_execute_cross_ctrl_reset; + return 0; default: return nvmet_report_invalid_opcode(req); } diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index 409928202503..7dbe9255ff42 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -114,6 +114,20 @@ u16 nvmet_zero_sgl(struct nvmet_req *req, off_t off, s= ize_t len) return 0; } =20 +void nvmet_ctrl_cleanup_ccrs(struct nvmet_ctrl *ctrl, bool all) +{ + struct nvmet_ccr *ccr, *tmp; + + lockdep_assert_held(&ctrl->lock); + + list_for_each_entry_safe(ccr, tmp, &ctrl->ccrs, entry) { + if (all || ccr->ctrl =3D=3D NULL) { + list_del(&ccr->entry); + kfree(ccr); + } + } +} + static u32 nvmet_max_nsid(struct nvmet_subsys *subsys) { struct nvmet_ns *cur; @@ -1396,6 +1410,7 @@ static void nvmet_start_ctrl(struct nvmet_ctrl *ctrl) if (!nvmet_is_disc_subsys(ctrl->subsys)) { ctrl->uniquifier =3D ((u8)(ctrl->uniquifier + 1)) ? 
: 1; ctrl->random =3D get_random_u64(); + nvmet_ctrl_cleanup_ccrs(ctrl, false); } ctrl->csts =3D NVME_CSTS_RDY; =20 @@ -1501,6 +1516,38 @@ struct nvmet_ctrl *nvmet_ctrl_find_get(const char *s= ubsysnqn, return ctrl; } =20 +struct nvmet_ctrl *nvmet_ctrl_find_get_ccr(struct nvmet_subsys *subsys, + const char *hostnqn, u8 ciu, + u16 cntlid, u64 cirn) +{ + struct nvmet_ctrl *ctrl; + bool found =3D false; + + mutex_lock(&subsys->lock); + list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry) { + if (ctrl->cntlid !=3D cntlid) + continue; + if (strncmp(ctrl->hostnqn, hostnqn, NVMF_NQN_SIZE)) + continue; + + /* Avoid racing with a controller that is becoming ready */ + mutex_lock(&ctrl->lock); + if (ctrl->uniquifier =3D=3D ciu && ctrl->random =3D=3D cirn) + found =3D true; + mutex_unlock(&ctrl->lock); + + if (found) { + if (kref_get_unless_zero(&ctrl->ref)) + goto out; + break; + } + }; + ctrl =3D NULL; +out: + mutex_unlock(&subsys->lock); + return ctrl; +} + u16 nvmet_check_ctrl_status(struct nvmet_req *req) { if (unlikely(!(req->sq->ctrl->cc & NVME_CC_ENABLE))) { @@ -1626,6 +1673,7 @@ struct nvmet_ctrl *nvmet_alloc_ctrl(struct nvmet_allo= c_ctrl_args *args) subsys->clear_ids =3D 1; #endif =20 + INIT_LIST_HEAD(&ctrl->ccrs); INIT_WORK(&ctrl->async_event_work, nvmet_async_event_work); INIT_LIST_HEAD(&ctrl->async_events); INIT_RADIX_TREE(&ctrl->p2p_ns_map, GFP_KERNEL); @@ -1740,12 +1788,33 @@ struct nvmet_ctrl *nvmet_alloc_ctrl(struct nvmet_al= loc_ctrl_args *args) } EXPORT_SYMBOL_GPL(nvmet_alloc_ctrl); =20 +static void nvmet_ctrl_complete_pending_ccr(struct nvmet_ctrl *ctrl) +{ + struct nvmet_subsys *subsys =3D ctrl->subsys; + struct nvmet_ctrl *sctrl; + struct nvmet_ccr *ccr; + + mutex_lock(&ctrl->lock); + nvmet_ctrl_cleanup_ccrs(ctrl, true); + mutex_unlock(&ctrl->lock); + + list_for_each_entry(sctrl, &subsys->ctrls, subsys_entry) { + mutex_lock(&sctrl->lock); + list_for_each_entry(ccr, &sctrl->ccrs, entry) { + if (ccr->ctrl =3D=3D ctrl) + ccr->ctrl =3D NULL; + } + 
mutex_unlock(&sctrl->lock); + } +} + static void nvmet_ctrl_free(struct kref *ref) { struct nvmet_ctrl *ctrl =3D container_of(ref, struct nvmet_ctrl, ref); struct nvmet_subsys *subsys =3D ctrl->subsys; =20 mutex_lock(&subsys->lock); + nvmet_ctrl_complete_pending_ccr(ctrl); nvmet_ctrl_destroy_pr(ctrl); nvmet_release_p2p_ns_map(ctrl); list_del(&ctrl->subsys_entry); diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index 4195c9eff1da..6c0091b8af8b 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -267,6 +267,7 @@ struct nvmet_ctrl { u32 kato; u64 random; =20 + struct list_head ccrs; struct nvmet_port *port; =20 u32 aen_enabled; @@ -314,6 +315,13 @@ struct nvmet_ctrl { struct nvmet_pr_log_mgr pr_log_mgr; }; =20 +struct nvmet_ccr { + struct nvmet_ctrl *ctrl; + struct list_head entry; + u16 icid; + u8 ciu; +}; + struct nvmet_subsys { enum nvme_subsys_type type; =20 @@ -576,6 +584,7 @@ void nvmet_req_free_sgls(struct nvmet_req *req); void nvmet_execute_set_features(struct nvmet_req *req); void nvmet_execute_get_features(struct nvmet_req *req); void nvmet_execute_keep_alive(struct nvmet_req *req); +void nvmet_execute_cross_ctrl_reset(struct nvmet_req *req); =20 u16 nvmet_check_cqid(struct nvmet_ctrl *ctrl, u16 cqid, bool create); u16 nvmet_check_io_cqid(struct nvmet_ctrl *ctrl, u16 cqid, bool create); @@ -618,6 +627,10 @@ struct nvmet_ctrl *nvmet_alloc_ctrl(struct nvmet_alloc= _ctrl_args *args); struct nvmet_ctrl *nvmet_ctrl_find_get(const char *subsysnqn, const char *hostnqn, u16 cntlid, struct nvmet_req *req); +struct nvmet_ctrl *nvmet_ctrl_find_get_ccr(struct nvmet_subsys *subsys, + const char *hostnqn, u8 ciu, + u16 cntlid, u64 cirn); +void nvmet_ctrl_cleanup_ccrs(struct nvmet_ctrl *ctrl, bool all); void nvmet_ctrl_put(struct nvmet_ctrl *ctrl); u16 nvmet_check_ctrl_status(struct nvmet_req *req); ssize_t nvmet_ctrl_host_traddr(struct nvmet_ctrl *ctrl, diff --git a/include/linux/nvme.h b/include/linux/nvme.h index 
5135cdc3c120..0f305b317aa3 100644 --- a/include/linux/nvme.h +++ b/include/linux/nvme.h @@ -23,6 +23,7 @@ =20 #define NVMF_CQT_MS 0 #define NVMF_CCR_LIMIT 4 +#define NVMF_CCR_PER_PAGE 511 =20 #define NVME_DISC_SUBSYS_NAME "nqn.2014-08.org.nvmexpress.discovery" =20 @@ -1225,6 +1226,22 @@ struct nvme_zone_mgmt_recv_cmd { __le32 cdw14[2]; }; =20 +struct nvme_cross_ctrl_reset_cmd { + __u8 opcode; + __u8 flags; + __u16 command_id; + __le32 nsid; + __le64 rsvd2[2]; + union nvme_data_ptr dptr; + __u8 rsvd10; + __u8 ciu; + __le16 icid; + __le32 cdw11; + __le64 cirn; + __le32 cdw14; + __le32 cdw15; +}; + struct nvme_io_mgmt_recv_cmd { __u8 opcode; __u8 flags; @@ -1323,6 +1340,7 @@ enum nvme_admin_opcode { nvme_admin_virtual_mgmt =3D 0x1c, nvme_admin_nvme_mi_send =3D 0x1d, nvme_admin_nvme_mi_recv =3D 0x1e, + nvme_admin_cross_ctrl_reset =3D 0x38, nvme_admin_dbbuf =3D 0x7C, nvme_admin_format_nvm =3D 0x80, nvme_admin_security_send =3D 0x81, @@ -1356,6 +1374,7 @@ enum nvme_admin_opcode { nvme_admin_opcode_name(nvme_admin_virtual_mgmt), \ nvme_admin_opcode_name(nvme_admin_nvme_mi_send), \ nvme_admin_opcode_name(nvme_admin_nvme_mi_recv), \ + nvme_admin_opcode_name(nvme_admin_cross_ctrl_reset), \ nvme_admin_opcode_name(nvme_admin_dbbuf), \ nvme_admin_opcode_name(nvme_admin_format_nvm), \ nvme_admin_opcode_name(nvme_admin_security_send), \ @@ -2009,6 +2028,7 @@ struct nvme_command { struct nvme_dbbuf dbbuf; struct nvme_directive_cmd directive; struct nvme_io_mgmt_recv_cmd imr; + struct nvme_cross_ctrl_reset_cmd ccr; }; }; =20 @@ -2173,6 +2193,9 @@ enum { NVME_SC_PMR_SAN_PROHIBITED =3D 0x123, NVME_SC_ANA_GROUP_ID_INVALID =3D 0x124, NVME_SC_ANA_ATTACH_FAILED =3D 0x125, + NVME_SC_CCR_IN_PROGRESS =3D 0x13f, + NVME_SC_CCR_LOGPAGE_FULL =3D 0x140, + NVME_SC_CCR_LIMIT_EXCEEDED =3D 0x141, =20 /* * I/O Command Set Specific - NVM commands: --=20 2.51.2 From nobody Mon Dec 1 23:34:59 2025 Received: from mail-dl1-f48.google.com (mail-dl1-f48.google.com [74.125.82.48]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 434E030BBAC for ; Wed, 26 Nov 2025 02:13:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123207; cv=none; b=FVLt1om8UfnZ2IHQCb9AMdaB3NY2IcQpm5lN69/2pID/vgw4a/D1y9cliPBJuUKi9CVhhFUnUQTMtXRQDnBnW8lNg4r9Q/E7XAipsl7t9sqHjCR7VdslKW6yMSoiFx8+f2BMU2bIm/7s3KtmEhs42cC/iQaSR7dMRzwEwLL3xm0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123207; c=relaxed/simple; bh=u7W20si0SUqps+5PlbgF3xoCTUVA5ja/Q7/V9V/8FlU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kaFrzQ8pB2U87UMFebF5UnSNJft4cQOO/CijztISHu3Iuai1+Kue9Gmie6stILnNhUDtoKGHE/G721AswFwNr+hL4SRy00CujiJCtcBl3T2tuIytDA35AwZOJJwrNAUhIUDmv59d0w86iJMPl9jzsIQxL1Y9zvGt2c3y8xu+TtU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=MsxM9byn; arc=none smtp.client-ip=74.125.82.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="MsxM9byn" Received: by mail-dl1-f48.google.com with SMTP id a92af1059eb24-11b6bc976d6so555517c88.0 for ; Tue, 25 Nov 2025 18:13:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123205; x=1764728005; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date 
:message-id:reply-to; bh=myVWmyIcrLWM/1NcPstHQcp5kFjs3Rj2+YxkwTil04s=; b=MsxM9bynFIWdF6K0CK2ZR7rsVwHfcbkRSCQlyDSGo/JLxnJ7PxSI5DaMC+j6kinQxT cLjfoN802V8GRvawpYFeP2IuudapDvK2xjVt8lLAZ/hZqXbw+F/Xv3+rhv3pSBi1Z+c/ 5LrsYh28bWfF+c4B24z2j1//J262ORpem89wQVa7hxG0Y0vTIl/s74+9KKY1D7WuNTI8 SlsNpU3+8c6P4q3BgVzhtuz8+GgKyIx60a/wKlM+jg4QGDA2etys36wD1hPr5aO3iqQb BtW/Hy9MFtNN8XqMJftmhlc+zDint8zbGC3IvGswD+cEWlPRAZnkWYcz6rEEqrRb7VlM O5oA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123205; x=1764728005; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=myVWmyIcrLWM/1NcPstHQcp5kFjs3Rj2+YxkwTil04s=; b=haQjtbKiGANAAsNiKSJLylCR0MnNjx6bn3yPbhZWnhmtdyHS+JDMjxrza9l0XZa9e/ 8cxSfiT2tOy5y0OOL9zBP6WG132MqxG0F7oAJLK3nNWyzhbJf6CizCYzi38uoaeqb+H+ S0lE0LSBHCQpg2/7hZWchqJ7Qn94qgDddRiQ3rznF1VLVb2WDz/D4tol8xW/dkUWmVqD OkEJ9ukjZyAjqbvdB8iEN5JAvSW58mUJ8AouiNbQJ6MXgQ6K13KZ4dGLfOzgUl2xjpKb kZQ54n/rMe+50b8ueRp1PKB1X5xZfN4+aYRuWwvUasJS+SKiyEirzr5lMiUXSY2M10oT CesQ== X-Forwarded-Encrypted: i=1; AJvYcCVKsDPZkFq4jzrtfk8dqYXQ6o5IxLDRR3Tr6tqVAEKh6Yc6a9483rS7Xm/VChelJXHyu1gI3X7F1H5TMvE=@vger.kernel.org X-Gm-Message-State: AOJu0YwJFHyZXk2Wtnz+tV3XUkxMKcU/w5yjeTvqISs68fEMqWb/iBvW 3rZWCFX0f7Jt/VnRmXOutkwhMWm9llOJsNPt3LTUUXvOGi88ArPhp5J6yVWyiUmEpza82Nnf9mG nMkte X-Gm-Gg: ASbGncvpfHa+ekU9RO82/SEd7Q2qrMepA4SxWFXtlcwamt7NOEF56wMQGu7azPX7qQs qD59JSxVUMMD39yYh4RAU+7Cm7BpSCl1UvfQZS21vPzRtsFsYK32m9SVV4dH915s2A+lDOlCEIy QxmwH+8Q0zaU3z+AJnDWi0rdJ9nKGimm5rfYPg5DKnBXU4bNlrhZBWt6vgSCXzr+slULzknQ6vq lQB5xSX1E0Nai7j5bAVg+dBe+RI1Y9Odu5FI2SDBlHoqfDqlJxelQPqQ/gilpyY/LE7JZv7MQI4 y4z7bEU0nVZ7vJibowrx9IaSNobwdKAhcr+9nxUKF0GGwskhs9S1QzwFgxu1fzaiG6tdj2zYHjR aQ6oST+1YQ/MDmkBezA6minGwFk/anY9uM8fQ5R9zC9hB6zKrepP/6LKBAMgt3nXjViX2L1PX/F 7wcFQZ4lnPbNYKYCIcmgj411dYJ/5Wy8RixA== X-Google-Smtp-Source: 
AGHT+IF762dvIfiy+MJfaaRdfezo6gJCeNNNIF95vwqjmCbVZcZEuTz+XQbUmdcGAbK33RotfljyPQ== X-Received: by 2002:a05:7022:ea53:b0:11b:7dcd:ca8d with SMTP id a92af1059eb24-11c9caaf0cfmr7562603c88.19.1764123204952; Tue, 25 Nov 2025 18:13:24 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:24 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 04/14] nvmet: Implement CCR logpage Date: Tue, 25 Nov 2025 18:11:51 -0800 Message-ID: <20251126021250.2583630-5-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The CCR (Cross-Controller Reset) log page, defined by TP8028 Rapid Path Failure Recovery, contains an entry for each CCR request submitted to the source controller. Implement the CCR log page for the Linux NVMe target. 
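For reviewers checking the log page size: the sketch below mirrors, in plain userspace C, the two structures this patch adds to include/linux/nvme.h. The names ccr_log_entry, ccr_log and CCR_PER_PAGE are illustrative stand-ins, and uint16_t replaces __le16, so this shows layout only and ignores endianness. Under those assumptions, an 8-byte header plus 511 entries of 8 bytes each comes to exactly 4096 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for NVMF_CCR_PER_PAGE from the patch. */
#define CCR_PER_PAGE 511

/* Userspace mirror of struct nvme_ccr_log_entry (layout only). */
struct ccr_log_entry {
	uint16_t icid;
	uint8_t  ciu;
	uint8_t  rsvd3;
	uint16_t acid;
	uint8_t  ccrs;   /* the patch reports 0x00 or 0x01 here */
	uint8_t  ccrf;   /* the patch always reports 0x03 here */
};

/* Userspace mirror of struct nvme_ccr_log (layout only). */
struct ccr_log {
	uint16_t ne;         /* set from the entry count in the patch */
	uint8_t  rsvd2[6];
	struct ccr_log_entry entries[CCR_PER_PAGE];
};

/* Compile-time checks: 8-byte header + 511 * 8-byte entries = 4096. */
_Static_assert(sizeof(struct ccr_log_entry) == 8, "entry is 8 bytes");
_Static_assert(offsetof(struct ccr_log, entries) == 8, "header is 8 bytes");
_Static_assert(sizeof(struct ccr_log) == 4096, "log is 4096 bytes");
```

If the static assertions compile, the mirrored layout has no implicit padding on the build target.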
Signed-off-by: Mohamed Khalfella --- drivers/nvme/target/admin-cmd.c | 42 +++++++++++++++++++++++++++++++++ include/linux/nvme.h | 16 +++++++++++++ 2 files changed, 58 insertions(+) diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cm= d.c index a55ca010d34f..d2892354bf81 100644 --- a/drivers/nvme/target/admin-cmd.c +++ b/drivers/nvme/target/admin-cmd.c @@ -220,6 +220,7 @@ static void nvmet_execute_get_supported_log_pages(struc= t nvmet_req *req) logs->lids[NVME_LOG_FEATURES] =3D cpu_to_le32(NVME_LIDS_LSUPP); logs->lids[NVME_LOG_RMI] =3D cpu_to_le32(NVME_LIDS_LSUPP); logs->lids[NVME_LOG_RESERVATION] =3D cpu_to_le32(NVME_LIDS_LSUPP); + logs->lids[NVME_LOG_CCR] =3D cpu_to_le32(NVME_LIDS_LSUPP); =20 status =3D nvmet_copy_to_sgl(req, 0, logs, sizeof(*logs)); kfree(logs); @@ -608,6 +609,45 @@ static void nvmet_execute_get_log_page_features(struct= nvmet_req *req) nvmet_req_complete(req, status); } =20 +static void nvmet_execute_get_log_page_ccr(struct nvmet_req *req) +{ + struct nvmet_ctrl *ctrl =3D req->sq->ctrl; + struct nvmet_ccr *ccr; + struct nvme_ccr_log *log; + int index =3D 0; + u16 status; + + log =3D kzalloc(sizeof(*log), GFP_KERNEL); + if (!log) { + status =3D NVME_SC_INTERNAL; + goto out; + } + + mutex_lock(&ctrl->lock); + list_for_each_entry(ccr, &ctrl->ccrs, entry) { + log->entries[index].icid =3D cpu_to_le16(ccr->icid); + log->entries[index].ciu =3D ccr->ciu; + log->entries[index].acid =3D cpu_to_le16(0xffff); + + /* If ccr->ctrl is NULL then we know reset succeeded */ + log->entries[index].ccrs =3D ccr->ctrl ? 
0x00 : 0x01; + log->entries[index].ccrf =3D 0x03; /* Validated and Initiated */ + index++; + } + + /* Cleanup completed CCRs if requested */ + if (req->cmd->get_log_page.lsp & 0x1) + nvmet_ctrl_cleanup_ccrs(ctrl, false); + mutex_unlock(&ctrl->lock); + + log->ne =3D cpu_to_le16(index); + nvmet_clear_aen_bit(req, NVME_AEN_BIT_CCR_COMPLETE); + status =3D nvmet_copy_to_sgl(req, 0, log, sizeof(*log)); + kfree(log); +out: + nvmet_req_complete(req, status); +} + static void nvmet_execute_get_log_page(struct nvmet_req *req) { if (!nvmet_check_transfer_len(req, nvmet_get_log_page_len(req->cmd))) @@ -641,6 +681,8 @@ static void nvmet_execute_get_log_page(struct nvmet_req= *req) return nvmet_execute_get_log_page_rmi(req); case NVME_LOG_RESERVATION: return nvmet_execute_get_log_page_resv(req); + case NVME_LOG_CCR: + return nvmet_execute_get_log_page_ccr(req); } pr_debug("unhandled lid %d on qid %d\n", req->cmd->get_log_page.lid, req->sq->qid); diff --git a/include/linux/nvme.h b/include/linux/nvme.h index 0f305b317aa3..d51883122d65 100644 --- a/include/linux/nvme.h +++ b/include/linux/nvme.h @@ -1435,6 +1435,7 @@ enum { NVME_LOG_FDP_CONFIGS =3D 0x20, NVME_LOG_DISC =3D 0x70, NVME_LOG_RESERVATION =3D 0x80, + NVME_LOG_CCR =3D 0x1E, NVME_FWACT_REPL =3D (0 << 3), NVME_FWACT_REPL_ACTV =3D (1 << 3), NVME_FWACT_ACTV =3D (2 << 3), @@ -1458,6 +1459,21 @@ enum { NVME_FIS_CSCPE =3D 1 << 21, }; =20 +struct nvme_ccr_log_entry { + __le16 icid; + __u8 ciu; + __u8 rsvd3; + __le16 acid; + __u8 ccrs; + __u8 ccrf; +}; + +struct nvme_ccr_log { + __le16 ne; + __u8 rsvd2[6]; + struct nvme_ccr_log_entry entries[NVMF_CCR_PER_PAGE]; +}; + /* NVMe Namespace Write Protect State */ enum { NVME_NS_NO_WRITE_PROTECT =3D 0, --=20 2.51.2 From nobody Mon Dec 1 23:34:59 2025 Received: from mail-pf1-f170.google.com (mail-pf1-f170.google.com [209.85.210.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 
C59AB30BF67 for ; Wed, 26 Nov 2025 02:13:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123208; cv=none; b=EmwrEdur2moyqG7L4V9P+8kVuktTbbl9M5lvgBXiR4svSFg/15Veo0TPNP3gxRo/aaUJFSWvD73OipQDtm6/MQ7JUAlgOEXVg7plD2HlgSscrS4taziQCuPLsuXsMG/xRS1VBy4sYfZH6nqqZi5HyjlwYF+HNU0WQXSXo+cnD6A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123208; c=relaxed/simple; bh=ObajzURZPpq5CwZKMTFWSecxl+UxPzyGjYPnyHSwBvE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XGcLFb1bqkO5oH7mc9uRXEbZ5joClOK3qSStUwKtttNLzFQszttmPNMcMZOw68S9++5+tkwhJhNqaRFL7dRet//e3hGj5gx5NFkB0kpti96TSn8Nd4LkoA83jaABhcsQsgtPVewsdPDzMi7JhjQrQaW2Iim/aj5qhWkmhiwHU3c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=NRHYdpE4; arc=none smtp.client-ip=209.85.210.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="NRHYdpE4" Received: by mail-pf1-f170.google.com with SMTP id d2e1a72fcca58-7aae5f2633dso7184683b3a.3 for ; Tue, 25 Nov 2025 18:13:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123206; x=1764728006; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8XbXxl9w54eoJ5bw9vwLfDrvdJe4OVmNRP09snH5Q98=; b=NRHYdpE4/vC9GuLUW9HmZeqInkDlkWSPKJbetAhT1/8s6lopKnOs8V0roJdiUIPPUQ 
+vDlrI0JyhrDv71dbGMKwK4lmOE2VH+Kd/00E4a9TMAlK0Xbii9kcI3FWFB3xy8f0+0u IkZYduBrzlnBxfd4keYhqoQuRhUqdcplf7H6ohtS+fFaeCgCJQqzPoPziBwyTJlhL5xZ 7DsAixzVDfTmSNSvMw2jfyHA2OrHbt+Q4D+kGn4VMzKSu3gAg2b4Xqbl3uobHOqpIBX4 Pd8B/ukoO449t2Yd6bdBTDzmHv7wbKH+JiLXlnxPZLualUtk/U1ibNGPYmirDxWCIjUV y8CA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123206; x=1764728006; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=8XbXxl9w54eoJ5bw9vwLfDrvdJe4OVmNRP09snH5Q98=; b=MYHtUM9ckoMpvjHHqf7O8D9MUz+lBNK8Jg8FG5M2xYURV0tH8UMeKhjLSOVDBRCqQP 7do6j1TJoM03qamQOvZ9inKJ5DabJiRZti0/BdCEt8+Wd11daT5113OeGtK5MqKW6m2h +jN4s0cSvaALzTtl7+1D+P1uw4dAbQomCAg9mbyjUWtP7YXz5rZDNohj+Y6a85LoFDVQ hLbjmgSdjfztlmk+gJZFxtmbjpQMhZh5EuiJDM6JTvc+fQgoGJZoyWH/wUmErnlhUUMm kpbbCt7o4TtN9Vcbq3ww7HzeQWSMw0gVezMUppTw1WkSNi858x02F3Z2Hwkb026dJbBl tAjA== X-Forwarded-Encrypted: i=1; AJvYcCVWIM1OmqE4nsfhJm6jCFpW849MUvaI+Zw8ZLhkG38XA4IIsA1iJ/XKQJr+88ZGBVl6hm3cvLUYptLO5xM=@vger.kernel.org X-Gm-Message-State: AOJu0Ywr0rAa0Vt+o8LaLx0vhT/uP25/Hpw6WuYbo6xvMphj0DxIh6gd qmi9zY/k6UlgOY4os+34ToNiXiM818BUKuH+RhAFT5errmSt1mdSGLJHYA4dmhNwfGA= X-Gm-Gg: ASbGnctNEkc1HdZH/pbIWRCm4E/cdxsABNJDk6fult4E6AKcXq/jy8UN0lIXrDFPW1w fnC5yJhIHvH/75iDOOG0/T9H+wna5JlnfQ39bOVNN26Nq3LGAWAXa416L+g0BJYRrt/zAN94Ld5 /ZjCE1eKerTz8UdJaAF5rK2y34XAc/PCMvEfihfhcdc4Jhm66X1mUSLiUiARTkHz78NIjRpIP9f E2o9IcseRKpXUUVHOTHteHDeBj6ILFFh6bi1In/8nYlklalgcvBBa0zqHqRIGL/o6wRr+5+4ZYb UyJb5cItGhMXJ6xnN5VcRDYd0Yq8z59zBX+BTRdkf8dNytHGOafWhBTKZCqGpRSOr0dPoPIOFny HdlMTHYJyRhFLN5Ls4f9XFG/vqtHjwYbJsvjhkZEE8+EIxxl6tUWgF1Kh0BuxseBiJl4EUeM4No 89HFAK+pyqNcqmJWK3qrTVx9uMp/+BljCjFHUhxLW6cm60 X-Google-Smtp-Source: AGHT+IHwR/uNyy4XREluHWJ5+ovrWb65gDmMgcIcohHXB6flzaenya5awGJFYvmrd36ufdEuxYrdaA== X-Received: by 2002:a05:7022:671f:b0:11a:436c:2d56 with SMTP id a92af1059eb24-11c9d7090ffmr11613972c88.2.1764123205776; Tue, 25 Nov 2025 
18:13:25 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:25 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 05/14] nvmet: Send an AEN on CCR completion Date: Tue, 25 Nov 2025 18:11:52 -0800 Message-ID: <20251126021250.2583630-6-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Send an AEN to the initiator when the impacted controller exits. The notification points to the CCR log page, which the initiator can read to check which CCR operation completed. 
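As a rough illustration of what the initiator sees, the sketch below packs the three values this patch passes to the AEN path (event type, event info, log page) into a completion dword 0 in the usual Asynchronous Event Request layout: type in bits 2:0, info in bits 15:8, log page identifier in bits 23:16. The helper names are hypothetical, AER_NOTICE is assumed to equal NVME_AER_NOTICE (0x02), and the enable check is simplified: it tests only the configuration bit and does not model the masking that nvmet clears when the log page is read.

```c
#include <stdbool.h>
#include <stdint.h>

#define AER_NOTICE               0x02u /* assumed value of NVME_AER_NOTICE */
#define AER_NOTICE_CCR_COMPLETED 0xf4u /* from this patch */
#define LOG_CCR                  0x1eu /* NVME_LOG_CCR from patch 04 */
#define AEN_BIT_CCR_COMPLETE     20    /* from this patch */

/* Simplified gate: is the CCR-complete bit set in the AEN config mask? */
static inline bool ccr_aen_enabled(uint32_t aen_cfg)
{
	return (aen_cfg >> AEN_BIT_CCR_COMPLETE) & 1u;
}

/* Pack type, info and log page into an AER completion dword 0. */
static inline uint32_t aen_result(uint8_t type, uint8_t info, uint8_t log_page)
{
	return (uint32_t)type |
	       ((uint32_t)info << 8) |
	       ((uint32_t)log_page << 16);
}
```

With these values, aen_result(AER_NOTICE, AER_NOTICE_CCR_COMPLETED, LOG_CCR) evaluates to 0x1ef402.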
Signed-off-by: Mohamed Khalfella --- drivers/nvme/target/core.c | 27 +++++++++++++++++++++++---- drivers/nvme/target/nvmet.h | 3 ++- include/linux/nvme.h | 3 +++ 3 files changed, 28 insertions(+), 5 deletions(-) diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index 7dbe9255ff42..60173833c3eb 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -202,7 +202,7 @@ static void nvmet_async_event_work(struct work_struct *= work) nvmet_async_events_process(ctrl); } =20 -void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type, +static void nvmet_add_async_event_locked(struct nvmet_ctrl *ctrl, u8 event= _type, u8 event_info, u8 log_page) { struct nvmet_async_event *aen; @@ -215,12 +215,17 @@ void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u= 8 event_type, aen->event_info =3D event_info; aen->log_page =3D log_page; =20 - mutex_lock(&ctrl->lock); list_add_tail(&aen->entry, &ctrl->async_events); - mutex_unlock(&ctrl->lock); =20 queue_work(nvmet_wq, &ctrl->async_event_work); } +void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type, + u8 event_info, u8 log_page) +{ + mutex_lock(&ctrl->lock); + nvmet_add_async_event_locked(ctrl, event_type, event_info, log_page); + mutex_unlock(&ctrl->lock); +} =20 static void nvmet_add_to_changed_ns_log(struct nvmet_ctrl *ctrl, __le32 ns= id) { @@ -1788,6 +1793,18 @@ struct nvmet_ctrl *nvmet_alloc_ctrl(struct nvmet_all= oc_ctrl_args *args) } EXPORT_SYMBOL_GPL(nvmet_alloc_ctrl); =20 +static void nvmet_ctrl_notify_ccr(struct nvmet_ctrl *ctrl) +{ + lockdep_assert_held(&ctrl->lock); + + if (nvmet_aen_bit_disabled(ctrl, NVME_AEN_BIT_CCR_COMPLETE)) + return; + + nvmet_add_async_event_locked(ctrl, NVME_AER_NOTICE, + NVME_AER_NOTICE_CCR_COMPLETED, + NVME_LOG_CCR); +} + static void nvmet_ctrl_complete_pending_ccr(struct nvmet_ctrl *ctrl) { struct nvmet_subsys *subsys =3D ctrl->subsys; @@ -1801,8 +1818,10 @@ static void nvmet_ctrl_complete_pending_ccr(struct n= vmet_ctrl *ctrl) 
list_for_each_entry(sctrl, &subsys->ctrls, subsys_entry) { mutex_lock(&sctrl->lock); list_for_each_entry(ccr, &sctrl->ccrs, entry) { - if (ccr->ctrl =3D=3D ctrl) + if (ccr->ctrl =3D=3D ctrl) { + nvmet_ctrl_notify_ccr(sctrl); ccr->ctrl =3D NULL; + } } mutex_unlock(&sctrl->lock); } diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index 6c0091b8af8b..7ebcef13be2b 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -44,7 +44,8 @@ * Supported optional AENs: */ #define NVMET_AEN_CFG_OPTIONAL \ - (NVME_AEN_CFG_NS_ATTR | NVME_AEN_CFG_ANA_CHANGE) + (NVME_AEN_CFG_NS_ATTR | NVME_AEN_CFG_ANA_CHANGE | \ + NVME_AEN_CFG_CCR_COMPLETE) #define NVMET_DISC_AEN_CFG_OPTIONAL \ (NVME_AEN_CFG_DISC_CHANGE) =20 diff --git a/include/linux/nvme.h b/include/linux/nvme.h index d51883122d65..a145417dccd3 100644 --- a/include/linux/nvme.h +++ b/include/linux/nvme.h @@ -863,12 +863,14 @@ enum { NVME_AER_NOTICE_FW_ACT_STARTING =3D 0x01, NVME_AER_NOTICE_ANA =3D 0x03, NVME_AER_NOTICE_DISC_CHANGED =3D 0xf0, + NVME_AER_NOTICE_CCR_COMPLETED =3D 0xf4, }; =20 enum { NVME_AEN_BIT_NS_ATTR =3D 8, NVME_AEN_BIT_FW_ACT =3D 9, NVME_AEN_BIT_ANA_CHANGE =3D 11, + NVME_AEN_BIT_CCR_COMPLETE =3D 20, NVME_AEN_BIT_DISC_CHANGE =3D 31, }; =20 @@ -876,6 +878,7 @@ enum { NVME_AEN_CFG_NS_ATTR =3D 1 << NVME_AEN_BIT_NS_ATTR, NVME_AEN_CFG_FW_ACT =3D 1 << NVME_AEN_BIT_FW_ACT, NVME_AEN_CFG_ANA_CHANGE =3D 1 << NVME_AEN_BIT_ANA_CHANGE, + NVME_AEN_CFG_CCR_COMPLETE =3D 1 << NVME_AEN_BIT_CCR_COMPLETE, NVME_AEN_CFG_DISC_CHANGE =3D 1 << NVME_AEN_BIT_DISC_CHANGE, }; =20 --=20 2.51.2 From nobody Mon Dec 1 23:34:59 2025 Received: from mail-pg1-f181.google.com (mail-pg1-f181.google.com [209.85.215.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BC13F30C353 for ; Wed, 26 Nov 2025 02:13:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none 
smtp.client-ip=209.85.215.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123209; cv=none; b=tw2ZHD/Ob2eiqAIBVt9y8PH6A0+HKZ+ClwMz1I72e3qb0gf8QnVWKCnLxLElIYusOgnefnwgbp9oVk/AcXRPA6dgw8KLCVNHXl9s7sxHIO0WWwrSmAhQZ2cWHICjakT3SzHbHv4xffrrAhJ7WiljbrCUhhcKAJNoSHWYO2iNgNA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123209; c=relaxed/simple; bh=raOeNl2i6OnBrbh8eY6JByTnAnDmQNOmi6Ih/+dKDK8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qmJZX4NjtVfRHcROb+HZZZAVEySjd+X0Y9cDyhz5sMqM1T4MenV5+RtX3xD30ODmeKIYqCD05Dp3kGT2amgUNaeQoVnhK/H+rmxh5f0T0L9ZA43P3jLOJxo74T8U0t/G6yBc+FdqgMZnz6Ij7CjQI6VDACp5FnKDGwZn1QiDiK4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=VUCBqMO4; arc=none smtp.client-ip=209.85.215.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="VUCBqMO4" Received: by mail-pg1-f181.google.com with SMTP id 41be03b00d2f7-bd1b0e2c1eeso4763296a12.0 for ; Tue, 25 Nov 2025 18:13:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123207; x=1764728007; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Fup6CIzZoN1mUgXTOLLZWG3yGagOqWK2m84nC1cywRc=; b=VUCBqMO49cvMSwEwh+nIz7L106Yi320E7mDDO2bcLOV90B6vrLPCK51i5KawyUkxoz IUHgkFR0sNn3BaFzUwLGnJcbVNC12VQcrgrtaDVR18qqb0QWNgl2zU3K5XwCxWDhkBiA 
/gAZRCHIdqpxkhgWINwBVPuC/Vp37MxOdxdespw4/OHbQYDK/w6OdUNCKcD4AExjaMct 7E/7guQFysOlK5q6LznuKYhARqaRVWGE47InXP30JWABMpMtVnvO68PQrxZf2ymZqau3 5z50JEByroMIThA/ggVYyvq0JKf/x+pEp2S4kiJEHxgj5SxZY4Uj4DEdbdTnaq0XV7z6 Ig9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123207; x=1764728007; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Fup6CIzZoN1mUgXTOLLZWG3yGagOqWK2m84nC1cywRc=; b=j3/97HvqxNlWY8P5/PRGDP/E58r0rTgun98PewFT4RiTzSHwyhlVpVZbNCHLFjzxh9 Llju5QplZWZV5rYlG1C/o8w0b/zH8gCZwWLeQaQJAKjmotOZo62IMLgEoaMw2EOgR2RK ydEOzirfesfThMkcBEyNDv+KTraYYtuoSa63BVznsX64a60AU8z+4xj856vvuJvMhxAM 5c9t0oWCA9huu6dIgTURMhdcEcbv6DRQPTMQLXWBKRisbhAxva7pD4ygfyYQRzlb+QPt QS7aYjY3K06krXmvBi2/Ta+Em8hths/NvdNvLkzhpH/mEDLiKCFfYPzPafvwGuSfxmlJ ob7g== X-Forwarded-Encrypted: i=1; AJvYcCWdgtlHVyvvi/YMj9/JvD/5qsjDFC4nLtPq7spl7xjcXLKcAdUgNwxdFS8Oa0K8urFNPOB1yJlMTtFdTAw=@vger.kernel.org X-Gm-Message-State: AOJu0Yx+J7qjCFkM6qk+OhHFWl2D+3W76kGFisyZ2obQXosFHsjlZ85r 28wwPhadEWqCZKjwc4DFZEJTBDJbFCL5haXLHk7JYX60tzXKUjgqqYyBHgLbiHd4cLU= X-Gm-Gg: ASbGncso8tcLiyQAeVrW/eb4ttHaYyX59/dFggWC3mTQIY87MkkjD9DtvlX6oIObGRl W1B/jezjJxz1nbH9Uzk/ucjonZjsGRmgVZINMzT2mZaEwnz8ddBNCfgioEkHEUD3p8+Tj+t5mTD LYvCv+t0IePw3phW8AmymaQLUG29d8Z8hBRlknSe8c5PHPrM8sDOJqv2U3qWBGKws6G5ZL8RN15 COclUzi+uEtsKZdPWLb2sCQlW6z9lGNdzVVftUEBugdrxxws1EkSUsVvusyZXYc32ArE4ImD2o/ nbPKEvxFsBQyXS9KZai7jr+0/Vutly4UKF8viOc3OORfnpwNY/MUBAHltXNTzI064a4rqbYgBwt PaYN6ql4rGcU6K/l+LIDHUn9LdlQAjghXLkm+SP4CY3K1+Gm1vfdgYZa7i2oxO57ztdHr0YdSP3 V4W1wxa+WXNJknw2CUwnTAMJT+MK1q5ZnZmg== X-Google-Smtp-Source: AGHT+IEGtCJVxQEDpFf3Jd3BrTA4e27R/LfD8qTutalpA0yqSHCpemiGQYzP9RyUO13yMxhultiV+Q== X-Received: by 2002:a05:7022:670f:b0:11a:4016:4491 with SMTP id a92af1059eb24-11c9d84c6f8mr12170250c88.24.1764123206595; Tue, 25 Nov 2025 18:13:26 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) 
by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:26 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 06/14] nvme: Rapid Path Failure Recovery read controller identify fields Date: Tue, 25 Nov 2025 18:11:53 -0800 Message-ID: <20251126021250.2583630-7-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" TP8028 Rapid Path Failure Recovery added new fields to the controller identify response. Read CIU (Controller Instance Uniquifier), CIRN (Controller Instance Random Number), and CCRL (Cross-Controller Reset Limit) from the controller identify response. Expose CIU and CIRN as sysfs attributes so the values can be used directly by the user if needed. TP4129 KATO Corrections and Clarifications defined CQT (Command Quiesce Time), which is used along with KATO (Keep Alive Timeout) to set an upper limit for attempting Cross-Controller Recovery. 
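To make the KATO and CQT upper limit concrete, here is a minimal userspace sketch of the arithmetic used by the nvme_recovery_timeout_ms() helper in this patch. The function name and parameters below are illustrative. It assumes, as the multiplication by 1000 in the helper implies, that KATO is held in seconds and CQT in milliseconds, and it takes the TBKAS controller attribute as a plain boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Upper bound, in milliseconds, on how long recovery is attempted:
 *   3 * KATO + CQT when the controller reports TBKAS,
 *   2 * KATO + CQT otherwise.
 * kato_sec is in seconds, cqt_ms in milliseconds (assumed units).
 */
static inline uint64_t recovery_timeout_ms(uint32_t kato_sec,
					   uint16_t cqt_ms, bool tbkas)
{
	uint64_t multiplier = tbkas ? 3 : 2;

	return multiplier * kato_sec * 1000 + cqt_ms;
}
```

For example, with a KATO of 5 seconds and a CQT of 100 ms, the bound is 10100 ms without TBKAS and 15100 ms with it.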
Signed-off-by: Mohamed Khalfella --- drivers/nvme/host/core.c | 5 +++++ drivers/nvme/host/nvme.h | 11 +++++++++++ drivers/nvme/host/sysfs.c | 23 +++++++++++++++++++++++ 3 files changed, 39 insertions(+) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index fa4181d7de73..aa007a7b9606 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -3572,12 +3572,17 @@ static int nvme_init_identify(struct nvme_ctrl *ctr= l) ctrl->crdt[1] =3D le16_to_cpu(id->crdt2); ctrl->crdt[2] =3D le16_to_cpu(id->crdt3); =20 + ctrl->ciu =3D id->ciu; + ctrl->cirn =3D le64_to_cpu(id->cirn); + atomic_set(&ctrl->ccr_limit, id->ccrl); + ctrl->oacs =3D le16_to_cpu(id->oacs); ctrl->oncs =3D le16_to_cpu(id->oncs); ctrl->mtfa =3D le16_to_cpu(id->mtfa); ctrl->oaes =3D le32_to_cpu(id->oaes); ctrl->wctemp =3D le16_to_cpu(id->wctemp); ctrl->cctemp =3D le16_to_cpu(id->cctemp); + ctrl->cqt =3D le16_to_cpu(id->cqt); =20 atomic_set(&ctrl->abort_limit, id->acl + 1); ctrl->vwc =3D id->vwc; diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 102fae6a231c..5195a9abfadf 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -326,13 +326,17 @@ struct nvme_ctrl { u32 max_zone_append; #endif u16 crdt[3]; + u16 cqt; u16 oncs; u8 dmrl; + u8 ciu; u32 dmrsl; + u64 cirn; u16 oacs; u16 sqsize; u32 max_namespaces; atomic_t abort_limit; + atomic_t ccr_limit; u8 vwc; u32 vs; u32 sgls; @@ -1218,4 +1222,11 @@ static inline bool nvme_multi_css(struct nvme_ctrl *= ctrl) return (ctrl->ctrl_config & NVME_CC_CSS_MASK) =3D=3D NVME_CC_CSS_CSI; } =20 +static inline unsigned long nvme_recovery_timeout_ms(struct nvme_ctrl *ctr= l) +{ + if (ctrl->ctratt & NVME_CTRL_ATTR_TBKAS) + return 3 * ctrl->kato * 1000 + ctrl->cqt; + return 2 * ctrl->kato * 1000 + ctrl->cqt; +} + #endif /* _NVME_H */ diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c index 29430949ce2f..ae36249ad61e 100644 --- a/drivers/nvme/host/sysfs.c +++ b/drivers/nvme/host/sysfs.c @@ -388,6 
+388,27 @@ nvme_show_int_function(queue_count); nvme_show_int_function(sqsize); nvme_show_int_function(kato); =20 +static ssize_t nvme_sysfs_uniquifier_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct nvme_ctrl *ctrl =3D dev_get_drvdata(dev); + + return sysfs_emit(buf, "%02x\n", ctrl->ciu); +} +static DEVICE_ATTR(uniquifier, S_IRUGO, nvme_sysfs_uniquifier_show, NULL); + +static ssize_t nvme_sysfs_random_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct nvme_ctrl *ctrl =3D dev_get_drvdata(dev); + + return sysfs_emit(buf, "%016llx\n", ctrl->cirn); +} +static DEVICE_ATTR(random, S_IRUGO, nvme_sysfs_random_show, NULL); + + static ssize_t nvme_sysfs_delete(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) @@ -734,6 +755,8 @@ static struct attribute *nvme_dev_attrs[] =3D { &dev_attr_numa_node.attr, &dev_attr_queue_count.attr, &dev_attr_sqsize.attr, + &dev_attr_uniquifier.attr, + &dev_attr_random.attr, &dev_attr_hostnqn.attr, &dev_attr_hostid.attr, &dev_attr_ctrl_loss_tmo.attr, --=20 2.51.2 From nobody Mon Dec 1 23:34:59 2025 Received: from mail-pl1-f169.google.com (mail-pl1-f169.google.com [209.85.214.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8144A30C612 for ; Wed, 26 Nov 2025 02:13:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123210; cv=none; b=ovioQ1yuOOcFRdRBB3QiD3Q/HHceKI4ay2PpI0tZhcDLDyCugKztEs44o7jLiUal19IyrHH3YDJLaeG7kjkgrCHK3Wf2+RruVOPsdeofShGQBIPMHiWUUWkZ747gZyvaAWsU2CBAQLConKaWa78rYMmHmwGfj6yc477qn+5idY8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123210; c=relaxed/simple; bh=gXOjvEboseVtId1uagQgyDPNgelSyKfuK1gcsS48IXk=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=g/T/BZhxah2x6Rqdyop8JFnn2+66RwSS4LhbogUf3ktNMnI3VaD59sOIdMKNdHF4BZ1ZFXCTmyblWfKsZ7jCM8viSFQmWSkSYqcNUtL0S80LW2KDc4wBWTu+d9Ij6kPqRjTwaf3p0Ij2MBhHKPJ3xTCxPz8ZRykG1fbsOBgrvY4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=J0y4lwFM; arc=none smtp.client-ip=209.85.214.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="J0y4lwFM" Received: by mail-pl1-f169.google.com with SMTP id d9443c01a7336-2984dfae0acso96959015ad.0 for ; Tue, 25 Nov 2025 18:13:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123208; x=1764728008; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+I4ImapQBrzyuEizr600XxZNr5XxxSLizDI0ARnZn/M=; b=J0y4lwFM/eYMcnGf7KiI+UYi6qtlpyowdMIYPFdB6rLM+q+xApWVuHFmpW8/0UyMxe uxGSKbSLfqASvM1MBbe88mlq2tqnZ/Ps/RgZpM7Yn/ZOLQl4A7wAufRC1AcTCl7i45N9 XYVHLJwQGQ/pdYZrdPe5J8shkeFn6lImHkSQJkVT5v9k4PguP+Y5ROy5TqkIfemJjYcf C/p7kw7CVCDbcx/Dk181t4lN8OA1LFd62T5r5pZwQkV6YqrrlM+4IZ0aLS7wxuJcx+kO INJMKNoJWEnJ6lQOZaKRArw3xR6t+YwwBY1MtEvJvsrDYaf9FNEfmSaeP3KDB/ssNlei FrKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123208; x=1764728008; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; 
bh=+I4ImapQBrzyuEizr600XxZNr5XxxSLizDI0ARnZn/M=; b=Iu1OCerMbCsnPLTKWZGGZFWi3kdghnFOtGs7kwZKktzD0dQEC1Fpc1hqkr7vH84BXL HmozC380IvNR4jvuqnSPyAwiDzY1qmRTdA+5kqU4nOACid4F99nVc+yaZEsFWNbF+gcC PnY9DQ0g2Hyf52jJNJCCvb9Fgtz7f9GKRxVIeHMK8U4yeCSVJeoDWxBgfc0KvrF3ue0e SWyanmqfrgKv4VTdjQcV2e7PYHm8NvBzd46i6dllAibjusKOPNLulJg2//p2WcsB39V7 8+AYV9BNijWgXdZTV5/hKRAYIUsi95/iiJR7FbDMkDR6BFdUDnFKNccl83BPQGWx2Z/0 jESQ== X-Forwarded-Encrypted: i=1; AJvYcCVfcYU8dFyVBvrUz+Xmmkc89Kbh9E8honBLrOnFv8Y01ZkARTkqt3OuTvND+TfuaAeyW4k6CM45EAY34to=@vger.kernel.org X-Gm-Message-State: AOJu0YwUxJuRD+Ev9AYXHGPKVIhQkgRbp/Ef3XL2WeH8NSkMXi9KCjXB rLVEyFlBzvhbH9OVP5GuoWj/r2efCKG3dQ+IH3ELmCZzVVRccVqTsbA9OfKygzAVLwk= X-Gm-Gg: ASbGncsriX+VP6nFy8N8LSeVqixnt3OuFLZzMLKBMuhM/8/I7nN+614mXpAHu0JnVRv k8BLkxBIpLqrrrxi5s7MrUNlMNvmLDiowhLOqVcRHuKxkceho04nmzM66X6HQ9Fphchh0hWaEnO EA+enFjnw91SAwE8mG17KHUe36PO6ozruikUMWsCH/qnmjRPFTXadTRaGBMGR9X3FeVO0AjNIGA U9h2noRHEyQhIM4kaKMnHTtNCt2a9hIqQ7lUrIaGDdk7nM07vv0aNdk4mV8vICZKntS9etVZZCr bMxphQsu6ZGLcaM6r6Bl9Q6dqKNUvyqvdwMy1mPp7SpjO+jo6Cj+GfPAAenJ+MtCJoPXVpr+ZiO 6P+/JEBSsnCTF2yT25Lp7//wSl1ihwLkkWhtgHMlCT5FYUOngZWNZ1Dx9Rb0cJzwj4n3yjLrqHD HI8okoKx8dSHi+OxqD+YfS0OTaCnKJEQlDwg== X-Google-Smtp-Source: AGHT+IFUalrTLmoXcAJDo3arJsqQ+UxGMbo4/XplRRBL96NzTQ9qPQ+9cU3UaRbqugcq/gua0Spoww== X-Received: by 2002:a05:7022:3d08:b0:11b:9386:7ecf with SMTP id a92af1059eb24-11cbba59308mr4282110c88.44.1764123207411; Tue, 25 Nov 2025 18:13:27 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:27 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 07/14] nvme: Add 
RECOVERING nvme controller state Date: Tue, 25 Nov 2025 18:11:54 -0800 Message-ID: <20251126021250.2583630-8-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add NVME_CTRL_RECOVERING as a new controller state to be used while an impacted controller is being recovered. A LIVE controller enters the RECOVERING state when an IO error is encountered. While recovering, inflight IOs are not canceled if they time out; they are canceled after recovery finishes. Also, while recovering, a controller cannot be reset or deleted. This is intentional because a reset or delete would cancel inflight IOs. When recovery finishes, the impacted controller transitions from the RECOVERING state to the RESETTING state. The reset codepath takes care of queue teardown and inflight request cancellation. Note that no transition from RECOVERING to RESETTING is added to nvme_change_ctrl_state(), because the user should not be allowed to reset or delete a controller that is being recovered. Add the NVME_CTRL_RECOVERED controller flag. This flag is set on a controller that is about to schedule delayed work for time-based recovery. 
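The entry rule described above can be sketched in isolation as follows. This is a userspace illustration only: the enum is a shortened stand-in for enum nvme_ctrl_state, and can_enter_recovering() is a hypothetical helper that reproduces just the new RECOVERING case added to nvme_change_ctrl_state() by this patch.

```c
#include <stdbool.h>

/* Shortened, illustrative stand-in for enum nvme_ctrl_state. */
enum ctrl_state {
	CTRL_NEW,
	CTRL_LIVE,
	CTRL_RECOVERING,
	CTRL_RESETTING,
	CTRL_CONNECTING,
	CTRL_DELETING,
};

/* Only a LIVE controller may move to RECOVERING. */
static inline bool can_enter_recovering(enum ctrl_state old_state)
{
	switch (old_state) {
	case CTRL_LIVE:
		return true;
	default:
		return false;
	}
}
```

Every state other than LIVE, including RECOVERING itself, is rejected by this check.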
Signed-off-by: Mohamed Khalfella
---
 drivers/nvme/host/core.c  | 10 ++++++++++
 drivers/nvme/host/nvme.h  |  2 ++
 drivers/nvme/host/sysfs.c |  1 +
 3 files changed, 13 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index aa007a7b9606..f5b84bc327d3 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -574,6 +574,15 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 			break;
 		}
 		break;
+	case NVME_CTRL_RECOVERING:
+		switch (old_state) {
+		case NVME_CTRL_LIVE:
+			changed = true;
+			fallthrough;
+		default:
+			break;
+		}
+		break;
 	case NVME_CTRL_RESETTING:
 		switch (old_state) {
 		case NVME_CTRL_NEW:
@@ -761,6 +770,7 @@ blk_status_t nvme_fail_nonready_command(struct nvme_ctrl *ctrl,
 	if (state != NVME_CTRL_DELETING_NOIO &&
 	    state != NVME_CTRL_DELETING &&
 	    state != NVME_CTRL_DEAD &&
+	    state != NVME_CTRL_RECOVERING &&
 	    !test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags) &&
 	    !blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
 		return BLK_STS_RESOURCE;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 5195a9abfadf..cde427353e0a 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -251,6 +251,7 @@ static inline u16 nvme_req_qid(struct request *req)
 enum nvme_ctrl_state {
 	NVME_CTRL_NEW,
 	NVME_CTRL_LIVE,
+	NVME_CTRL_RECOVERING,
 	NVME_CTRL_RESETTING,
 	NVME_CTRL_CONNECTING,
 	NVME_CTRL_DELETING,
@@ -275,6 +276,7 @@ enum nvme_ctrl_flags {
 	NVME_CTRL_SKIP_ID_CNS_CS = 4,
 	NVME_CTRL_DIRTY_CAPABILITY = 5,
 	NVME_CTRL_FROZEN = 6,
+	NVME_CTRL_RECOVERED = 7,
 };
 
 struct nvme_ctrl {
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index ae36249ad61e..55f907fb6c86 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -443,6 +443,7 @@ static ssize_t nvme_sysfs_show_state(struct device *dev,
 	static const char *const state_name[] = {
 		[NVME_CTRL_NEW] = "new",
 		[NVME_CTRL_LIVE] = "live",
+		[NVME_CTRL_RECOVERING] = "recovering",
 		[NVME_CTRL_RESETTING] = "resetting",
 		[NVME_CTRL_CONNECTING] = "connecting",
 		[NVME_CTRL_DELETING] = "deleting",
-- 
2.51.2

From nobody Mon Dec 1 23:34:59 2025
From: Mohamed Khalfella
To: Chaitanya Kulkarni, Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg
Cc: Aaron Dailey, Randy Jennings, John Meneghini, Hannes Reinecke, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella
Subject: [RFC PATCH 08/14] nvme: Implement cross-controller reset recovery
Date: Tue, 25 Nov 2025 18:11:55 -0800
Message-ID: <20251126021250.2583630-9-mkhalfella@purestorage.com>
X-Mailer: git-send-email 2.51.2
In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com>
References: <20251126021250.2583630-1-mkhalfella@purestorage.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

A host that has more than one path connecting to an nvme subsystem
typically has an nvme controller associated with every path. This is
mostly applicable to nvmeof. If one path goes down, inflight IOs on that
path should not be retried immediately on another path because this
could lead to data corruption as described in TP4129.
TP8028 defines a cross-controller reset (CCR) mechanism that a host can
use to terminate IOs on the failed path through one of the remaining
healthy paths. Inflight IOs should be retried on another path only after
they have been terminated, or after enough time has passed as defined by
TP4129. Implement the core cross-controller reset logic shared by the
transports.

Signed-off-by: Mohamed Khalfella
---
 drivers/nvme/host/constants.c |   1 +
 drivers/nvme/host/core.c      | 133 ++++++++++++++++++++++++++++++++++
 drivers/nvme/host/nvme.h      |  10 +++
 3 files changed, 144 insertions(+)

diff --git a/drivers/nvme/host/constants.c b/drivers/nvme/host/constants.c
index dc90df9e13a2..f679efd5110e 100644
--- a/drivers/nvme/host/constants.c
+++ b/drivers/nvme/host/constants.c
@@ -46,6 +46,7 @@ static const char * const nvme_admin_ops[] = {
 	[nvme_admin_virtual_mgmt] = "Virtual Management",
 	[nvme_admin_nvme_mi_send] = "NVMe Send MI",
 	[nvme_admin_nvme_mi_recv] = "NVMe Receive MI",
+	[nvme_admin_cross_ctrl_reset] = "Cross Controller Reset",
 	[nvme_admin_dbbuf] = "Doorbell Buffer Config",
 	[nvme_admin_format_nvm] = "Format NVM",
 	[nvme_admin_security_send] = "Security Send",
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f5b84bc327d3..f38b70ca9cee 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -554,6 +554,138 @@ void nvme_cancel_admin_tagset(struct nvme_ctrl *ctrl)
 }
 EXPORT_SYMBOL_GPL(nvme_cancel_admin_tagset);
 
+static struct nvme_ctrl *nvme_find_ccr_ctrl(struct nvme_ctrl *ictrl,
+					    u32 min_cntlid)
+{
+	struct nvme_subsystem *subsys = ictrl->subsys;
+	struct nvme_ctrl *sctrl;
+	unsigned long flags;
+
+	mutex_lock(&nvme_subsystems_lock);
+	list_for_each_entry(sctrl, &subsys->ctrls, subsys_entry) {
+		if (sctrl->cntlid < min_cntlid)
+			continue;
+
+		if (atomic_dec_if_positive(&sctrl->ccr_limit) < 0)
+			continue;
+
+		spin_lock_irqsave(&sctrl->lock, flags);
+		if (sctrl->state != NVME_CTRL_LIVE) {
+			spin_unlock_irqrestore(&sctrl->lock, flags);
+			atomic_inc(&sctrl->ccr_limit);
+			continue;
+		}
+
+		/*
+		 * We got a good candidate source controller that is locked and
+		 * LIVE. However, no guarantee sctrl will not be deleted after
+		 * sctrl->lock is released. Get a ref of both sctrl and admin_q
+		 * so they do not disappear until we are done with them.
+		 */
+		WARN_ON_ONCE(!blk_get_queue(sctrl->admin_q));
+		nvme_get_ctrl(sctrl);
+		spin_unlock_irqrestore(&sctrl->lock, flags);
+		goto found;
+	}
+	sctrl = NULL;
+found:
+	mutex_unlock(&nvme_subsystems_lock);
+	return sctrl;
+}
+
+static int nvme_issue_wait_ccr(struct nvme_ctrl *sctrl, struct nvme_ctrl *ictrl)
+{
+	unsigned long flags, tmo, remain;
+	struct nvme_ccr_entry ccr = { };
+	union nvme_result res = { 0 };
+	struct nvme_command c = { };
+	u32 result;
+	int ret = 0;
+
+	init_completion(&ccr.complete);
+	ccr.ictrl = ictrl;
+
+	spin_lock_irqsave(&sctrl->lock, flags);
+	list_add_tail(&ccr.list, &sctrl->ccrs);
+	spin_unlock_irqrestore(&sctrl->lock, flags);
+
+	c.ccr.opcode = nvme_admin_cross_ctrl_reset;
+	c.ccr.ciu = ictrl->ciu;
+	c.ccr.icid = cpu_to_le16(ictrl->cntlid);
+	c.ccr.cirn = cpu_to_le64(ictrl->cirn);
+	ret = __nvme_submit_sync_cmd(sctrl->admin_q, &c, &res,
+				     NULL, 0, NVME_QID_ANY, 0);
+	if (ret)
+		goto out;
+
+	result = le32_to_cpu(res.u32);
+	if (result & 0x01) /* Immediate Reset */
+		goto out;
+
+	tmo = msecs_to_jiffies(max(ictrl->cqt, ictrl->kato * 1000));
+	remain = wait_for_completion_timeout(&ccr.complete, tmo);
+	if (!remain)
+		ret = -EAGAIN;
+out:
+	spin_lock_irqsave(&sctrl->lock, flags);
+	list_del(&ccr.list);
+	spin_unlock_irqrestore(&sctrl->lock, flags);
+	return ccr.ccrs == 1 ? 0 : ret;
+}
+
+unsigned long nvme_recover_ctrl(struct nvme_ctrl *ictrl)
+{
+	unsigned long deadline, now, timeout;
+	struct nvme_ctrl *sctrl;
+	u32 min_cntlid = 0;
+	int ret;
+
+	timeout = nvme_recovery_timeout_ms(ictrl);
+	dev_info(ictrl->device, "attempting CCR, timeout %lums\n", timeout);
+
+	now = jiffies;
+	deadline = now + msecs_to_jiffies(timeout);
+	while (time_before(now, deadline)) {
+		sctrl = nvme_find_ccr_ctrl(ictrl, min_cntlid);
+		if (!sctrl) {
+			/* CCR failed, switch to time-based recovery */
+			return deadline - now;
+		}
+
+		ret = nvme_issue_wait_ccr(sctrl, ictrl);
+		atomic_inc(&sctrl->ccr_limit);
+
+		if (!ret) {
+			dev_info(ictrl->device, "CCR succeeded using %s\n",
+				 dev_name(sctrl->device));
+			blk_put_queue(sctrl->admin_q);
+			nvme_put_ctrl(sctrl);
+			return 0;
+		}
+
+		/* Try another controller */
+		min_cntlid = sctrl->cntlid + 1;
+		blk_put_queue(sctrl->admin_q);
+		nvme_put_ctrl(sctrl);
+		now = jiffies;
+	}
+
+	dev_info(ictrl->device, "CCR reached timeout, call it done\n");
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nvme_recover_ctrl);
+
+void nvme_end_ctrl_recovery(struct nvme_ctrl *ctrl)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ctrl->lock, flags);
+	WRITE_ONCE(ctrl->state, NVME_CTRL_RESETTING);
+	wake_up_all(&ctrl->state_wq);
+	spin_unlock_irqrestore(&ctrl->lock, flags);
+}
+EXPORT_SYMBOL_GPL(nvme_end_ctrl_recovery);
+
 bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		enum nvme_ctrl_state new_state)
 {
@@ -5108,6 +5240,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 
 	mutex_init(&ctrl->scan_lock);
 	INIT_LIST_HEAD(&ctrl->namespaces);
+	INIT_LIST_HEAD(&ctrl->ccrs);
 	xa_init(&ctrl->cels);
 	ctrl->dev = dev;
 	ctrl->ops = ops;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index cde427353e0a..1f8937fce9a7 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -279,6 +279,13 @@ enum nvme_ctrl_flags {
 	NVME_CTRL_RECOVERED = 7,
 };
 
+struct nvme_ccr_entry {
+	struct list_head list;
+	struct completion complete;
+	struct nvme_ctrl *ictrl;
+	u8 ccrs;
+};
+
 struct nvme_ctrl {
 	bool comp_seen;
 	bool identified;
@@ -296,6 +303,7 @@ struct nvme_ctrl {
 	struct blk_mq_tag_set *tagset;
 	struct blk_mq_tag_set *admin_tagset;
 	struct list_head namespaces;
+	struct list_head ccrs;
 	struct mutex namespaces_lock;
 	struct srcu_struct srcu;
 	struct device ctrl_device;
@@ -805,6 +813,8 @@ blk_status_t nvme_host_path_error(struct request *req);
 bool nvme_cancel_request(struct request *req, void *data);
 void nvme_cancel_tagset(struct nvme_ctrl *ctrl);
 void nvme_cancel_admin_tagset(struct nvme_ctrl *ctrl);
+unsigned long nvme_recover_ctrl(struct nvme_ctrl *ctrl);
+void nvme_end_ctrl_recovery(struct nvme_ctrl *ctrl);
 bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		enum nvme_ctrl_state new_state);
 int nvme_disable_ctrl(struct nvme_ctrl *ctrl, bool shutdown);
-- 
2.51.2

From nobody Mon Dec 1 23:34:59 2025
From: Mohamed Khalfella
To: Chaitanya Kulkarni, Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg
Cc: Aaron Dailey, Randy Jennings, John Meneghini, Hannes Reinecke, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella
Subject: [RFC PATCH 09/14] nvme: Implement cross-controller reset completion
Date: Tue, 25 Nov 2025 18:11:56 -0800
Message-ID: <20251126021250.2583630-10-mkhalfella@purestorage.com>
X-Mailer: git-send-email 2.51.2
In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com>
References: <20251126021250.2583630-1-mkhalfella@purestorage.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

An nvme source controller that issues a CCR command expects to receive
an NVME_AER_NOTICE_CCR_COMPLETED notice when the pending CCR succeeds or
fails. Add sctrl->ccr_work to read the NVME_LOG_CCR log page and wake up
any thread waiting on CCR completion.

Signed-off-by: Mohamed Khalfella
---
 drivers/nvme/host/core.c | 49 +++++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/nvme.h |  1 +
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f38b70ca9cee..467754e77a2d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1894,7 +1894,8 @@ EXPORT_SYMBOL_GPL(nvme_set_queue_count);
 
 #define NVME_AEN_SUPPORTED \
 	(NVME_AEN_CFG_NS_ATTR | NVME_AEN_CFG_FW_ACT | \
-	 NVME_AEN_CFG_ANA_CHANGE | NVME_AEN_CFG_DISC_CHANGE)
+	 NVME_AEN_CFG_ANA_CHANGE | NVME_AEN_CFG_CCR_COMPLETE | \
+	 NVME_AEN_CFG_DISC_CHANGE)
 
 static void nvme_enable_aen(struct nvme_ctrl *ctrl)
 {
@@ -4860,6 +4861,47 @@ static void nvme_get_fw_slot_info(struct nvme_ctrl *ctrl)
 	kfree(log);
 }
 
+static void nvme_ccr_work(struct work_struct *work)
+{
+	struct nvme_ctrl *ctrl = container_of(work, struct nvme_ctrl, ccr_work);
+	struct nvme_ccr_entry *ccr;
+	struct nvme_ccr_log_entry *entry;
+	struct nvme_ccr_log *log;
+	unsigned long flags;
+	int ret, i;
+
+	log = kmalloc(sizeof(*log), GFP_KERNEL);
+	if (!log)
+		return;
+
+	ret = nvme_get_log(ctrl, 0, NVME_LOG_CCR, 0x01,
+			   0x00, log, sizeof(*log), 0);
+	if (ret)
+		goto out;
+
+	spin_lock_irqsave(&ctrl->lock, flags);
+	for (i = 0; i < le16_to_cpu(log->ne); i++) {
+		entry = &log->entries[i];
+		if (entry->ccrs == 0) /* skip in progress entries */
+			continue;
+
+		list_for_each_entry(ccr, &ctrl->ccrs, list) {
+			struct nvme_ctrl *ictrl = ccr->ictrl;
+
+			if (ictrl->cntlid != le16_to_cpu(entry->icid) ||
+			    ictrl->ciu != entry->ciu)
+				continue;
+
+			/* Complete matching entry */
+			ccr->ccrs = entry->ccrs;
+			complete(&ccr->complete);
+		}
+	}
+	spin_unlock_irqrestore(&ctrl->lock, flags);
+out:
+	kfree(log);
+}
+
 static void nvme_fw_act_work(struct work_struct *work)
 {
 	struct nvme_ctrl *ctrl = container_of(work,
@@ -4936,6 +4978,9 @@ static bool nvme_handle_aen_notice(struct nvme_ctrl *ctrl, u32 result)
 	case NVME_AER_NOTICE_DISC_CHANGED:
 		ctrl->aen_result = result;
 		break;
+	case NVME_AER_NOTICE_CCR_COMPLETED:
+		queue_work(nvme_wq, &ctrl->ccr_work);
+		break;
 	default:
 		dev_warn(ctrl->device, "async event result %08x\n", result);
 	}
@@ -5126,6 +5171,7 @@ void nvme_stop_ctrl(struct nvme_ctrl *ctrl)
 	nvme_stop_failfast_work(ctrl);
 	flush_work(&ctrl->async_event_work);
 	cancel_work_sync(&ctrl->fw_act_work);
+	cancel_work_sync(&ctrl->ccr_work);
 	if (ctrl->ops->stop_ctrl)
 		ctrl->ops->stop_ctrl(ctrl);
 }
@@ -5247,6 +5293,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 	ctrl->quirks = quirks;
 	ctrl->numa_node = NUMA_NO_NODE;
 	INIT_WORK(&ctrl->scan_work, nvme_scan_work);
+	INIT_WORK(&ctrl->ccr_work, nvme_ccr_work);
 	INIT_WORK(&ctrl->async_event_work, nvme_async_event_work);
 	INIT_WORK(&ctrl->fw_act_work, nvme_fw_act_work);
 	INIT_WORK(&ctrl->delete_work, nvme_delete_ctrl_work);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 1f8937fce9a7..3f5a0722304d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -366,6 +366,7 @@ struct nvme_ctrl {
 	struct nvme_effects_log *effects;
 	struct xarray cels;
 	struct work_struct scan_work;
+	struct work_struct ccr_work;
 	struct work_struct async_event_work;
 	struct delayed_work ka_work;
 	struct delayed_work failfast_work;
-- 
2.51.2

From nobody Mon Dec 1 23:34:59 2025
From: Mohamed Khalfella
To: Chaitanya Kulkarni, Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg
Cc: Aaron Dailey, Randy Jennings, John Meneghini, Hannes Reinecke, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella
Subject: [RFC PATCH 10/14] nvme-tcp: Use CCR to recover controller that hits an error
Date: Tue, 25 Nov 2025 18:11:57 -0800
Message-ID: <20251126021250.2583630-11-mkhalfella@purestorage.com>
X-Mailer: git-send-email 2.51.2
In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com>
References: <20251126021250.2583630-1-mkhalfella@purestorage.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

A live nvme controller that hits an error now moves to RECOVERING state
instead of RESETTING state. In RECOVERING state, ctrl->err_work attempts
to use cross-controller recovery to terminate inflight IOs on the
controller.

If CCR succeeds, switch to RESETTING state and continue error recovery
as usual by tearing down the controller and attempting to reconnect to
the target. If CCR fails, the behavior of recovery depends on whether
CQT is supported. If CQT is supported, switch to time-based recovery by
holding inflight IOs until it is safe for them to be retried. If CQT is
not supported, proceed to retry requests immediately, as the code
currently does.
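The decision described above can be sketched as a small user-space function. This is a hedged model, not the driver code: `recovery_input`, `recovery_action` and `decide_recovery()` are hypothetical names, `ccr_remaining` stands in for the value returned by the CCR attempt (0 meaning CCR succeeded or its deadline passed), and `timed_wait_done` stands in for the NVME_CTRL_RECOVERED flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of one pass of the error recovery work. */
enum recovery_action {
	RECOVERY_DONE,	/* leave RECOVERING, continue with the usual reset */
	RECOVERY_WAIT,	/* re-queue the work and keep holding inflight IOs */
};

/* Hypothetical inputs; field names are assumptions for this sketch. */
struct recovery_input {
	bool timed_wait_done;		/* models the NVME_CTRL_RECOVERED flag */
	unsigned long ccr_remaining;	/* time left after CCR gave up, 0 if none */
	unsigned int cqt;		/* 0 models "CQT not supported" */
};

static enum recovery_action decide_recovery(struct recovery_input *in,
					    unsigned long *delay)
{
	assert(in && delay);
	*delay = 0;

	/* Second pass: the time-based wait has already elapsed. */
	if (in->timed_wait_done) {
		in->timed_wait_done = false;
		return RECOVERY_DONE;
	}

	/* CCR succeeded, or its deadline was reached. */
	if (in->ccr_remaining == 0)
		return RECOVERY_DONE;

	/* CCR failed and no CQT: fall back to immediate retry. */
	if (in->cqt == 0)
		return RECOVERY_DONE;

	/* CCR failed with CQT: wait out the remaining time, then come back. */
	in->timed_wait_done = true;
	*delay = in->ccr_remaining;
	return RECOVERY_WAIT;
}
```

A caller that gets `RECOVERY_WAIT` would schedule the same work again after `delay`; the next pass then sees `timed_wait_done` set and finishes recovery.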
To support time-based recovery, turn ctrl->err_work into delayed work.
Update nvme_tcp_timeout() so that it does not complete inflight IOs
while the controller is in RECOVERING state.

Signed-off-by: Mohamed Khalfella
---
 drivers/nvme/host/tcp.c | 52 +++++++++++++++++++++++++++++++++++------
 1 file changed, 45 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 9a96df1a511c..ec9a713490a9 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -193,7 +193,7 @@ struct nvme_tcp_ctrl {
 	struct sockaddr_storage src_addr;
 	struct nvme_ctrl ctrl;
 
-	struct work_struct err_work;
+	struct delayed_work err_work;
 	struct delayed_work connect_work;
 	struct nvme_tcp_request async_req;
 	u32 io_queues[HCTX_MAX_TYPES];
@@ -611,11 +611,12 @@ static void nvme_tcp_init_recv_ctx(struct nvme_tcp_queue *queue)
 
 static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl)
 {
-	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
+	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RECOVERING) &&
+	    !nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
 		return;
 
 	dev_warn(ctrl->device, "starting error recovery\n");
-	queue_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work);
+	queue_delayed_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work, 0);
 }
 
 static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
@@ -2470,12 +2471,48 @@ static void nvme_tcp_reconnect_ctrl_work(struct work_struct *work)
 	nvme_tcp_reconnect_or_remove(ctrl, ret);
 }
 
+static int nvme_tcp_recover_ctrl(struct nvme_ctrl *ctrl)
+{
+	unsigned long rem;
+
+	if (test_and_clear_bit(NVME_CTRL_RECOVERED, &ctrl->flags)) {
+		dev_info(ctrl->device, "completed time-based recovery\n");
+		goto done;
+	}
+
+	rem = nvme_recover_ctrl(ctrl);
+	if (!rem)
+		goto done;
+
+	if (!ctrl->cqt) {
+		dev_info(ctrl->device,
+			 "CCR failed, CQT not supported, skip time-based recovery\n");
+		goto done;
+	}
+
+	dev_info(ctrl->device,
+		 "CCR failed, switch to time-based recovery, timeout = %ums\n",
+		 jiffies_to_msecs(rem));
+	set_bit(NVME_CTRL_RECOVERED, &ctrl->flags);
+	queue_delayed_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work, rem);
+	return -EAGAIN;
+
+done:
+	nvme_end_ctrl_recovery(ctrl);
+	return 0;
+}
+
 static void nvme_tcp_error_recovery_work(struct work_struct *work)
 {
-	struct nvme_tcp_ctrl *tcp_ctrl = container_of(work,
+	struct nvme_tcp_ctrl *tcp_ctrl = container_of(to_delayed_work(work),
 				struct nvme_tcp_ctrl, err_work);
 	struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
 
+	if (nvme_ctrl_state(ctrl) == NVME_CTRL_RECOVERING) {
+		if (nvme_tcp_recover_ctrl(ctrl))
+			return;
+	}
+
 	if (nvme_tcp_key_revoke_needed(ctrl))
 		nvme_auth_revoke_tls_key(ctrl);
 	nvme_stop_keep_alive(ctrl);
@@ -2545,7 +2582,7 @@ static void nvme_reset_ctrl_work(struct work_struct *work)
 
 static void nvme_tcp_stop_ctrl(struct nvme_ctrl *ctrl)
 {
-	flush_work(&to_tcp_ctrl(ctrl)->err_work);
+	flush_delayed_work(&to_tcp_ctrl(ctrl)->err_work);
 	cancel_delayed_work_sync(&to_tcp_ctrl(ctrl)->connect_work);
 }
 
@@ -2640,6 +2677,7 @@ static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq)
 {
 	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
 	struct nvme_ctrl *ctrl = &req->queue->ctrl->ctrl;
+	enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
 	struct nvme_tcp_cmd_pdu *pdu = nvme_tcp_req_cmd_pdu(req);
 	struct nvme_command *cmd = &pdu->cmd;
 	int qid = nvme_tcp_queue_id(req->queue);
@@ -2649,7 +2687,7 @@ static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq)
 		 rq->tag, nvme_cid(rq), pdu->hdr.type, cmd->common.opcode,
 		 nvme_fabrics_opcode_str(qid, cmd), qid);
 
-	if (nvme_ctrl_state(ctrl) != NVME_CTRL_LIVE) {
+	if (state != NVME_CTRL_LIVE && state != NVME_CTRL_RECOVERING) {
 		/*
 		 * If we are resetting, connecting or deleting we should
 		 * complete immediately because we may block controller
@@ -2903,7 +2941,7 @@ static struct nvme_tcp_ctrl *nvme_tcp_alloc_ctrl(struct device *dev,
 
 	INIT_DELAYED_WORK(&ctrl->connect_work,
 			nvme_tcp_reconnect_ctrl_work);
-	INIT_WORK(&ctrl->err_work, nvme_tcp_error_recovery_work);
+	INIT_DELAYED_WORK(&ctrl->err_work, nvme_tcp_error_recovery_work);
 	INIT_WORK(&ctrl->ctrl.reset_work, nvme_reset_ctrl_work);
 
 	if (!(opts->mask & NVMF_OPT_TRSVCID)) {
-- 
2.51.2

From nobody Mon Dec 1 23:34:59 2025
From: Mohamed Khalfella
To: Chaitanya Kulkarni, Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg
Cc: Aaron Dailey, Randy Jennings, John Meneghini, Hannes Reinecke, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella
Subject: [RFC PATCH 11/14] nvme-rdma: Use CCR to recover controller that hits an error
Date: Tue, 25 Nov 2025 18:11:58 -0800
Message-ID: <20251126021250.2583630-12-mkhalfella@purestorage.com>
X-Mailer: git-send-email 2.51.2
In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com>
References: <20251126021250.2583630-1-mkhalfella@purestorage.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

A live nvme controller that hits an error will now move to RECOVERING
state instead of RESETTING state. In RECOVERING state, ctrl->err_work
will attempt to use cross-controller recovery to terminate inflight IOs
on the controller.
If CCR succeeds, then switch to RESETTING state and continue error recovery as usual by tearing down the controller and attempting to reconnect to the target. If CCR fails, the behavior of recovery depends on whether CQT is supported or not. If CQT is supported, switch to time-based recovery by holding inflight IOs until it is safe for them to be retried. If CQT is not supported, proceed to retry requests immediately, as the code currently does. To support implementing time-based recovery, turn ctrl->err_work into delayed work. Update nvme_rdma_timeout() to not complete inflight IOs while the controller is in RECOVERING state. Signed-off-by: Mohamed Khalfella --- drivers/nvme/host/rdma.c | 51 ++++++++++++++++++++++++++++++++++------ 1 file changed, 44 insertions(+), 7 deletions(-) diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index 190a4cfa8a5e..4a8bb2614468 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -106,7 +106,7 @@ struct nvme_rdma_ctrl { =20 /* other member variables */ struct blk_mq_tag_set tag_set; - struct work_struct err_work; + struct delayed_work err_work; =20 struct nvme_rdma_qe async_event_sqe; =20 @@ -961,7 +961,7 @@ static void nvme_rdma_stop_ctrl(struct nvme_ctrl *nctrl) { struct nvme_rdma_ctrl *ctrl =3D to_rdma_ctrl(nctrl); =20 - flush_work(&ctrl->err_work); + flush_delayed_work(&ctrl->err_work); cancel_delayed_work_sync(&ctrl->reconnect_work); } =20 @@ -1120,11 +1120,46 @@ static void nvme_rdma_reconnect_ctrl_work(struct wo= rk_struct *work) nvme_rdma_reconnect_or_remove(ctrl, ret); } =20 +static int nvme_rdma_recover_ctrl(struct nvme_ctrl *ctrl) +{ + unsigned long rem; + + if (test_and_clear_bit(NVME_CTRL_RECOVERED, &ctrl->flags)) { + dev_info(ctrl->device, "completed time-based recovery\n"); + goto done; + } + + rem =3D nvme_recover_ctrl(ctrl); + if (!rem) + goto done; + + if (!ctrl->cqt) { + dev_info(ctrl->device, + "CCR failed, CQT not supported, skip time-based recovery\n"); + goto done; + } + + 
dev_info(ctrl->device, + "CCR failed, switch to time-based recovery, timeout =3D %ums\n", + jiffies_to_msecs(rem)); + set_bit(NVME_CTRL_RECOVERED, &ctrl->flags); + queue_delayed_work(nvme_reset_wq, &to_rdma_ctrl(ctrl)->err_work, rem); + return -EAGAIN; + +done: + nvme_end_ctrl_recovery(ctrl); + return 0; +} static void nvme_rdma_error_recovery_work(struct work_struct *work) { - struct nvme_rdma_ctrl *ctrl =3D container_of(work, + struct nvme_rdma_ctrl *ctrl =3D container_of(to_delayed_work(work), struct nvme_rdma_ctrl, err_work); =20 + if (nvme_ctrl_state(&ctrl->ctrl) =3D=3D NVME_CTRL_RECOVERING) { + if (nvme_rdma_recover_ctrl(&ctrl->ctrl)) + return; + } + nvme_stop_keep_alive(&ctrl->ctrl); flush_work(&ctrl->ctrl.async_event_work); nvme_rdma_teardown_io_queues(ctrl, false); @@ -1147,11 +1182,12 @@ static void nvme_rdma_error_recovery_work(struct wo= rk_struct *work) =20 static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl) { - if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING)) + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECOVERING) && + !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING)) return; =20 dev_warn(ctrl->ctrl.device, "starting error recovery\n"); - queue_work(nvme_reset_wq, &ctrl->err_work); + queue_delayed_work(nvme_reset_wq, &ctrl->err_work, 0); } =20 static void nvme_rdma_end_request(struct nvme_rdma_request *req) @@ -1955,6 +1991,7 @@ static enum blk_eh_timer_return nvme_rdma_timeout(str= uct request *rq) struct nvme_rdma_request *req =3D blk_mq_rq_to_pdu(rq); struct nvme_rdma_queue *queue =3D req->queue; struct nvme_rdma_ctrl *ctrl =3D queue->ctrl; + enum nvme_ctrl_state state =3D nvme_ctrl_state(&ctrl->ctrl); struct nvme_command *cmd =3D req->req.cmd; int qid =3D nvme_rdma_queue_idx(queue); =20 @@ -1963,7 +2000,7 @@ static enum blk_eh_timer_return nvme_rdma_timeout(str= uct request *rq) rq->tag, nvme_cid(rq), cmd->common.opcode, nvme_fabrics_opcode_str(qid, cmd), qid); =20 - if (nvme_ctrl_state(&ctrl->ctrl) 
!=3D NVME_CTRL_LIVE) { + if (state !=3D NVME_CTRL_LIVE && state !=3D NVME_CTRL_RECOVERING) { /* * If we are resetting, connecting or deleting we should * complete immediately because we may block controller @@ -2280,7 +2317,7 @@ static struct nvme_rdma_ctrl *nvme_rdma_alloc_ctrl(st= ruct device *dev, =20 INIT_DELAYED_WORK(&ctrl->reconnect_work, nvme_rdma_reconnect_ctrl_work); - INIT_WORK(&ctrl->err_work, nvme_rdma_error_recovery_work); + INIT_DELAYED_WORK(&ctrl->err_work, nvme_rdma_error_recovery_work); INIT_WORK(&ctrl->ctrl.reset_work, nvme_rdma_reset_ctrl_work); =20 ctrl->ctrl.queue_count =3D opts->nr_io_queues + opts->nr_write_queues + --=20 2.51.2 From nobody Mon Dec 1 23:34:59 2025 Received: from mail-pl1-f179.google.com (mail-pl1-f179.google.com [209.85.214.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8830E3112C1 for ; Wed, 26 Nov 2025 02:13:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123216; cv=none; b=h7Ks+iuvZYkUKnoSpRpzH+NGX/O4FVDMAomYLJrO5tb2+cGN5m5gCNDOs2v2oWEp3bqF+6Hbz3210HuvGZjB2WK9+xmb8jTDRmqRCneuFVdt0J5OfjLDTwynTAcg6v6y3AhXZKAXNgelRRIBLo1D0i9MrdJETSz8tuYPI0GZ4tI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123216; c=relaxed/simple; bh=2Dhg+Gf3VHCFYYumFr6fDuTUsi4yLb6raLqRFabN+n0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sPJU4wja8s9uRW+/ZkBMWZLg9/fNC7ws/UnKZVZi+Q3hSbI0oMbDSKIHJE2MItd6sQ6kxUm6DTYxH1oKjzrcfeJbWnJHWUFcZIB0HrPHBzrVQwQAygnaujYH8k0lDswtSBSbV+WQpqbDxR0pdhwR5l3GRvRNtx0TncBnT+3JxI4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com 
header.b=EWcdH7ll; arc=none smtp.client-ip=209.85.214.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="EWcdH7ll" Received: by mail-pl1-f179.google.com with SMTP id d9443c01a7336-2953ad5517dso73831035ad.0 for ; Tue, 25 Nov 2025 18:13:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123213; x=1764728013; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=0Fd+Dwjm4hhF+0ivkrKhAcPCv0eUGO971dYvnW029ds=; b=EWcdH7lliYK3ZXM9lJtMKzxA6dU6BGAUxqXoe6msxzbIKagcnO3FOhHYGPKWis8uNF VsZppXO7i67WrglvltFhaWcUqcIXNJbZ+Aac9CgwKzlnMSAqvyWz7vzKYCLZhHG0u6DD lTqb39F6Wtx7UDvGsNy5Ziwh1ImRnWhh+BMvwYDOh9py8CSHDGEQT05EwGCCIPhFLNAe mvVCfekkD+3Vox0nQC9uOBPljc0rsO0ieHtujNoOnhb6spWEA0MbzbH47VnkS/xk2IZO 1Y4yyyar5TbYYpRPyNhacAZ4PS7PgLCyADhG/fIjL/xe0NeWXM8WcpII3+yaJOyCaNUb PC3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123213; x=1764728013; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=0Fd+Dwjm4hhF+0ivkrKhAcPCv0eUGO971dYvnW029ds=; b=VGI0x/HFn6Zd/WBTU62kiiuLZHDjy3MjJD1T3v6IKFF9l5WNPYoWCKy8gZvfgsRlsx W+W/+SziF39s4nEetXdG9uiwyKIQ7Zb4YX1tbSTqV0r0uDbTK1O98Ur18k7t3KUXFbna k9ey+/CQQFLBAHFGTLz64QymiZbSzSLMNIXTDSPpy6OO9q2lWJ6mdaFb90gbVpn4Tkmc htIahqrEZm2USo/MKFFT1rlGih+1qZn5H+Y46BfMYDfzL56HcQwxn+eOURWRAhj4AMlL naSKkn5YGw/9cQiDOpxjVmrWNT10e6NafvNlVveQgB+U8dblfDVJ15727x6wMZugmR7p yGlg== X-Forwarded-Encrypted: i=1; 
AJvYcCVvEoc9mh7hiLVPBli5QrZq4bLDb8q4qsnEXpt4bGOOknTV2afuKymptyw2Klq7tZto2stEJO8rsRiHyX8=@vger.kernel.org X-Gm-Message-State: AOJu0YyoXsqoc5EMgqfT+eUNGw+sgKDlQgo352jaWw0HNlQsLP8RwlhZ zijkB832g8fNBJiYTreWqnxdg+IM+Yg1nBokW5tYY3dBDBL7DshVedxXZ1Kh1H0gbpQ= X-Gm-Gg: ASbGncudKrNn+qbqigY5xTSObWZUpdl8D9vH0yGtkSpkprmJetjhCfH6dTL1tfTjlEB 5RXFbj9hGzjo2bS/u8JROonY1Ns3aV3VX4wCpwVye3BS1CDNXXswb8BT9n7q9JhkA0gGA7UJYcD 8jd1j01BObgo74i+U5N9T6RFsEbxD2FnYaAGfPgvTwGJw4I6wCBrcti81LYTwet95N103C8E721 N0wQ9DXx0cfCRax3wqzYpHZNUmZSrSrJuuyjd1pFCrV3m8WPjhbOgtDDbJ5LkEQ7O8qst6NsxY7 wo/TCaeXZVO+elTphNC2JK/KT7WAV+CgI1LFYUx69/P9OT1vDpAN83xUooIMWS5KSJ2sNCphOZf vNQpHCIt5wLghsPwMlb9iAHx588MWya/xt1heNsst8GOcZs1VQ8ogz1GhTygSMoPOMwyjFSpYyN q1iNFaGCNBs+bKmvnaV1gHqmdn538kK0L7pg== X-Google-Smtp-Source: AGHT+IEftWKL8xzdyQ8vZoIcNNhpVSoMt6pt3C9ZSmpbTsmF4pmZ3geOTTgQi7gfbDCvEYXRw4ggSw== X-Received: by 2002:a05:7022:2510:b0:11b:9386:a386 with SMTP id a92af1059eb24-11cbba8496fmr3723350c88.41.1764123211315; Tue, 25 Nov 2025 18:13:31 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:31 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 12/14] nvme-fc: Decouple error recovery from controller reset Date: Tue, 25 Nov 2025 18:11:59 -0800 Message-ID: <20251126021250.2583630-13-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 
Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Calling nvme_fc_error_recovery() from nvme_fc_timeout() while the controller is in CONNECTING state results in the deadlock reported in the link below. Update nvme_fc_timeout() to schedule error recovery to avoid the deadlock. Prior to this change, if the controller was LIVE, error recovery reset the controller. This did not match nvme-tcp and nvme-rdma. Decouple error recovery from controller reset to match the other fabric transports. Link: https://lore.kernel.org/all/20250529214928.2112990-1-mkhalfella@pures= torage.com/ Signed-off-by: Mohamed Khalfella --- drivers/nvme/host/fc.c | 94 ++++++++++++++++++------------------------ 1 file changed, 41 insertions(+), 53 deletions(-) diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index 03987f497a5b..8b6a7c80015c 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -227,6 +227,8 @@ static DEFINE_IDA(nvme_fc_ctrl_cnt); static struct device *fc_udev_device; =20 static void nvme_fc_complete_rq(struct request *rq); +static void nvme_fc_start_ioerr_recovery(struct nvme_fc_ctrl *ctrl, + char *errmsg); =20 /* *********************** FC-NVME Port Management ***********************= * */ =20 @@ -786,7 +788,7 @@ nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl *ctr= l) "Reconnect", ctrl->cnum); =20 set_bit(ASSOC_FAILED, &ctrl->flags); - nvme_reset_ctrl(&ctrl->ctrl); + nvme_fc_start_ioerr_recovery(ctrl, "Connectivity Loss"); } =20 /** @@ -983,7 +985,7 @@ fc_dma_unmap_sg(struct device *dev, struct scatterlist = *sg, int nents, static void nvme_fc_ctrl_put(struct nvme_fc_ctrl *); static int nvme_fc_ctrl_get(struct nvme_fc_ctrl *); =20 -static void nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg= ); +static void nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl); =20 static void __nvme_fc_finish_ls_req(struct nvmefc_ls_req_op *lsop) @@ -1563,9 +1565,8 @@ nvme_fc_ls_disconnect_assoc(struct nvmefc_ls_rcv_op *= lsop) * for the 
association have been ABTS'd by * nvme_fc_delete_association(). */ - - /* fail the association */ - nvme_fc_error_recovery(ctrl, "Disconnect Association LS received"); + nvme_fc_start_ioerr_recovery(ctrl, + "Disconnect Association LS received"); =20 /* release the reference taken by nvme_fc_match_disconn_ls() */ nvme_fc_ctrl_put(ctrl); @@ -1867,7 +1868,7 @@ nvme_fc_ctrl_ioerr_work(struct work_struct *work) struct nvme_fc_ctrl *ctrl =3D container_of(work, struct nvme_fc_ctrl, ioerr_work); =20 - nvme_fc_error_recovery(ctrl, "transport detected io error"); + nvme_fc_error_recovery(ctrl); } =20 /* @@ -1888,6 +1889,17 @@ char *nvme_fc_io_getuuid(struct nvmefc_fcp_req *req) } EXPORT_SYMBOL_GPL(nvme_fc_io_getuuid); =20 +static void nvme_fc_start_ioerr_recovery(struct nvme_fc_ctrl *ctrl, + char *errmsg) +{ + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING)) + return; + + dev_warn(ctrl->ctrl.device, "NVME-FC{%d}: starting error recovery %s\n", + ctrl->cnum, errmsg); + queue_delayed_work(nvme_reset_wq, &ctrl->ioerr_work, 0); +} + static void nvme_fc_fcpio_done(struct nvmefc_fcp_req *req) { @@ -2045,9 +2057,8 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req) nvme_fc_complete_rq(rq); =20 check_error: - if (terminate_assoc && - nvme_ctrl_state(&ctrl->ctrl) !=3D NVME_CTRL_RESETTING) - queue_work(nvme_reset_wq, &ctrl->ioerr_work); + if (terminate_assoc) + nvme_fc_start_ioerr_recovery(ctrl, "io error"); } =20 static int @@ -2497,39 +2508,6 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl = *ctrl, bool start_queues) nvme_unquiesce_admin_queue(&ctrl->ctrl); } =20 -static void -nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg) -{ - enum nvme_ctrl_state state =3D nvme_ctrl_state(&ctrl->ctrl); - - /* - * if an error (io timeout, etc) while (re)connecting, the remote - * port requested terminating of the association (disconnect_ls) - * or an error (timeout or abort) occurred on an io while creating - * the controller. 
Abort any ios on the association and let the - * create_association error path resolve things. - */ - if (state =3D=3D NVME_CTRL_CONNECTING) { - __nvme_fc_abort_outstanding_ios(ctrl, true); - dev_warn(ctrl->ctrl.device, - "NVME-FC{%d}: transport error during (re)connect\n", - ctrl->cnum); - return; - } - - /* Otherwise, only proceed if in LIVE state - e.g. on first error */ - if (state !=3D NVME_CTRL_LIVE) - return; - - dev_warn(ctrl->ctrl.device, - "NVME-FC{%d}: transport association event: %s\n", - ctrl->cnum, errmsg); - dev_warn(ctrl->ctrl.device, - "NVME-FC{%d}: resetting controller\n", ctrl->cnum); - - nvme_reset_ctrl(&ctrl->ctrl); -} - static enum blk_eh_timer_return nvme_fc_timeout(struct request *rq) { struct nvme_fc_fcp_op *op =3D blk_mq_rq_to_pdu(rq); @@ -2538,24 +2516,14 @@ static enum blk_eh_timer_return nvme_fc_timeout(str= uct request *rq) struct nvme_fc_cmd_iu *cmdiu =3D &op->cmd_iu; struct nvme_command *sqe =3D &cmdiu->sqe; =20 - /* - * Attempt to abort the offending command. Command completion - * will detect the aborted io and will fail the connection. - */ dev_info(ctrl->ctrl.device, "NVME-FC{%d.%d}: io timeout: opcode %d fctype %d (%s) w10/11: " "x%08x/x%08x\n", ctrl->cnum, qnum, sqe->common.opcode, sqe->fabrics.fctype, nvme_fabrics_opcode_str(qnum, sqe), sqe->common.cdw10, sqe->common.cdw11); - if (__nvme_fc_abort_op(ctrl, op)) - nvme_fc_error_recovery(ctrl, "io timeout abort failed"); =20 - /* - * the io abort has been initiated. Have the reset timer - * restarted and the abort completion will complete the io - * shortly. Avoids a synchronous wait while the abort finishes. 
- */ + nvme_fc_start_ioerr_recovery(ctrl, "io timeout"); return BLK_EH_RESET_TIMER; } =20 @@ -3347,6 +3315,26 @@ nvme_fc_reset_ctrl_work(struct work_struct *work) } } =20 +static void +nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl) +{ + nvme_stop_keep_alive(&ctrl->ctrl); + nvme_stop_ctrl(&ctrl->ctrl); + + /* will block while waiting for io to terminate */ + nvme_fc_delete_association(ctrl); + + /* Do not reconnect if controller is being deleted */ + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) + return; + + if (ctrl->rport->remoteport.port_state =3D=3D FC_OBJSTATE_ONLINE) { + queue_delayed_work(nvme_wq, &ctrl->connect_work, 0); + return; + } + + nvme_fc_reconnect_or_delete(ctrl, -ENOTCONN); +} =20 static const struct nvme_ctrl_ops nvme_fc_ctrl_ops =3D { .name =3D "fc", --=20 2.51.2 From nobody Mon Dec 1 23:34:59 2025 Received: from mail-pf1-f174.google.com (mail-pf1-f174.google.com [209.85.210.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 53A1F3112AB for ; Wed, 26 Nov 2025 02:13:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123216; cv=none; b=iIRstIdeULBSKQeQvkIcvhsTSRH6p0DNiRkzgCAw3jWO6EWB5f/2n2pJDYu9ITk02z4RWTSVvwBac67KauCW1E8PTAdcgQ7KGUQ6BBdh3tiMVjwk4n/vCTkgADnrBYgPe6PrEWOYE8xVWkCLluE0X4oQrgr8+FEi5AcFU/NXvg0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123216; c=relaxed/simple; bh=SiJziCso6wAVNLljyCcG+GZhg19EWsYooPO7XAzD1Sg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FnjyUyaBWRqDOR+PIy1uK67ZPA2E6oA2TydCDa/bIzVxCQio5diHVDaCrFxO9Py//t3fBDZc2+TTZdZVSTntVQxKyI4p1lfErFXIeXM6BPhkH0n8xUWvzezUeezdcZH8yXv8z7ODXGKAcNfpCz12iTS4vlSd5ud23G5f8tWwvcY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject 
dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=bVlwmVrY; arc=none smtp.client-ip=209.85.210.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="bVlwmVrY" Received: by mail-pf1-f174.google.com with SMTP id d2e1a72fcca58-7bc248dc16aso4625427b3a.0 for ; Tue, 25 Nov 2025 18:13:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123213; x=1764728013; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=o+tOUSVfYAveGgYFyI0DvGkCr7R3J34UcR6zCL8mu2o=; b=bVlwmVrY9qAOpe5pnOg0EgLm4uMnMGeV6q+8sHw8vOO0YEHn+f7R0gvvUc1VbkIDTB 8OZkwK3U+aBL7DRMFKZuELLF0G439pg/lISopiJchyAfqq82nNaATsyp8vUqD2JgBYeK /YaG9v2S4E/X+lwgg5Oa0chaXiloawqC1BXKbR9QDyMi73iKPPHvcIYYLAmXgLKIBH1F 5Ddii3zhiMiYIm1K+4iW4f2/Nmj7GIWxDPZ/xvcWAFHPlIQcZfjHF5jfihkWj5ZCmplu W7GDNhiG2bbH51Un4pGt8g3ETmzQogE66DYDYkbaWrSmDPQRDy1IMSPXORI+xEKY3f7u 7+aQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123213; x=1764728013; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=o+tOUSVfYAveGgYFyI0DvGkCr7R3J34UcR6zCL8mu2o=; b=u0er/rw2cO4ExEgtGm7Ba62/ZtihMEZObaliMXxfTFsiTRVPxSQypvQwUmC0TxmrIY pkHRL+0ZHUVpgpK49SoQoTtiVeiMd8Q+H0FOyW25/izJknU4YHPJnYvs5KfWsarEz2g+ 0cLbGGGvGVqdCcoUQk12//P5+LylYVt5kqcI1pu1fHnB9rTQgR2UcqlJZBiFq4gaXvH0 cyka8M0Q8RKrkXhgtt62SRM18Ob/i2JGlecJ7uwB6g0RFZY3jioFVkfXqtYF53zss26t 
rR3eFodbT1ZfRed018JxW+UlXGXY6BWX0piVSGZCK26tkSbf7jnUKBohUit9CNzIy4Kj jvkg== X-Forwarded-Encrypted: i=1; AJvYcCWwqIh0VvFTIKfr8ZAzfaPg01WoOUOOqCr3C9IqiGyMMZ99RDcO3ief0FZFX3w/KU33NOh6s2KZGzlw00M=@vger.kernel.org X-Gm-Message-State: AOJu0YwqufCwL48yUiCPGTN8LqwUkHF0M/S3ZEIGIQl8ZGw0BPmJmY/E BezO0JqfszCY8F43klsgXvQqEEuA8KiIaQ9t2/nFH6wOOBXfNwalqNsBAu/WMdnyKns= X-Gm-Gg: ASbGncvCwWh6FoemypbvASHZAxV+6moPL/bXkTMYYjdrbhTT4ghhzC312BI0Ib2mZhA e+eePa0G13K9s7vK+R/Tc8KCGbNKVPkjJdUOACWQ2bhpYbcmPeY6jnjaH1D1gefXCdDyov7tbnm gbHWRgQHcbYCJssvBNhrF2dnXhyMrOwqDwQBtD/EeDe52aOexlNWlRf7VlFsTktB4rPOf8qS8Ly 6PnALXmngQ1ynMzYOCENd+BOa0A7gPOwubdpJE2c1mIt19H9yIFbQWC5l4SALLIVwaiiZChGakO kttp+N2B8wE52d9TQXf1Oy/hi/dtgM43lZAUETKRd1WfltkXGJcBvgvUFgSXrL0JrXBJ5kk4E6H yCPeGU6E6eWjbgTqQIujRluJE4etMkstuc3KVh5wU85uRy/Foo6OVQOhZv1FfWxVYL62MaILPiO V2P7kSxVzMHz4GOu/EhARyk8AwgOqObkf0qB0rzk80B5eH X-Google-Smtp-Source: AGHT+IEX52DbN1W3fLNeFgULMbeGoHOJm/Hbe+XFe8rU25L89rCrrZxsAphbup4ofHcY7P6zLwy4Ww== X-Received: by 2002:a05:7022:ff48:b0:119:e56b:989b with SMTP id a92af1059eb24-11cb3ecc810mr3258084c88.2.1764123213235; Tue, 25 Nov 2025 18:13:33 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:32 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 13/14] nvme-fc: Use CCR to recover controller that hits an error Date: Tue, 25 Nov 2025 18:12:00 -0800 Message-ID: <20251126021250.2583630-14-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> 
Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" An alive nvme controller that hits an error will now move to RECOVERING state instead of RESETTING state. In RECOVERING state, ctrl->err_work will attempt to use cross-controller recovery to terminate inflight IOs on the controller. If CCR succeeds, then switch to RESETTING state and continue error recovery as usual by tearing down the controller and attempting to reconnect to the target. If CCR fails, the behavior of recovery depends on whether CQT is supported or not. If CQT is supported, switch to time-based recovery by holding inflight IOs until it is safe for them to be retried. If CQT is not supported, proceed to retry requests immediately, as the code currently does. Currently, inflight IOs can get completed during time-based recovery. This will be addressed in the next patch. Signed-off-by: Mohamed Khalfella --- drivers/nvme/host/fc.c | 52 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 45 insertions(+), 7 deletions(-) diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index 8b6a7c80015c..0e4d271bb4b6 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -166,7 +166,7 @@ struct nvme_fc_ctrl { struct blk_mq_tag_set admin_tag_set; struct blk_mq_tag_set tag_set; =20 - struct work_struct ioerr_work; + struct delayed_work ioerr_work; struct delayed_work connect_work; =20 struct kref ref; @@ -1862,11 +1862,48 @@ __nvme_fc_fcpop_chk_teardowns(struct nvme_fc_ctrl *= ctrl, } } =20 +static int nvme_fc_recover_ctrl(struct nvme_ctrl *ctrl) +{ + unsigned long rem; + + if (test_and_clear_bit(NVME_CTRL_RECOVERED, &ctrl->flags)) { + dev_info(ctrl->device, "completed time-based recovery\n"); + goto done; + } + + rem =3D nvme_recover_ctrl(ctrl); + if (!rem) + goto done; + + if (!ctrl->cqt) { + dev_info(ctrl->device, + "CCR failed, CQT not 
supported, skip time-based recovery\n"); + goto done; + } + + dev_info(ctrl->device, + "CCR failed, switch to time-based recovery, timeout =3D %ums\n", + jiffies_to_msecs(rem)); + + set_bit(NVME_CTRL_RECOVERED, &ctrl->flags); + queue_delayed_work(nvme_reset_wq, &to_fc_ctrl(ctrl)->ioerr_work, rem); + return -EAGAIN; + +done: + nvme_end_ctrl_recovery(ctrl); + return 0; +} + static void nvme_fc_ctrl_ioerr_work(struct work_struct *work) { - struct nvme_fc_ctrl *ctrl =3D - container_of(work, struct nvme_fc_ctrl, ioerr_work); + struct nvme_fc_ctrl *ctrl =3D container_of(to_delayed_work(work), + struct nvme_fc_ctrl, ioerr_work); + + if (nvme_ctrl_state(&ctrl->ctrl) =3D=3D NVME_CTRL_RECOVERING) { + if (nvme_fc_recover_ctrl(&ctrl->ctrl)) + return; + } =20 nvme_fc_error_recovery(ctrl); } @@ -1892,7 +1929,8 @@ EXPORT_SYMBOL_GPL(nvme_fc_io_getuuid); static void nvme_fc_start_ioerr_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg) { - if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING)) + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECOVERING) && + !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING)) return; =20 dev_warn(ctrl->ctrl.device, "NVME-FC{%d}: starting error recovery %s\n", @@ -3227,7 +3265,7 @@ nvme_fc_delete_ctrl(struct nvme_ctrl *nctrl) { struct nvme_fc_ctrl *ctrl =3D to_fc_ctrl(nctrl); =20 - cancel_work_sync(&ctrl->ioerr_work); + cancel_delayed_work_sync(&ctrl->ioerr_work); cancel_delayed_work_sync(&ctrl->connect_work); /* * kill the association on the link side. 
this will block
@@ -3465,7 +3503,7 @@ nvme_fc_alloc_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 
 	INIT_WORK(&ctrl->ctrl.reset_work, nvme_fc_reset_ctrl_work);
 	INIT_DELAYED_WORK(&ctrl->connect_work, nvme_fc_connect_ctrl_work);
-	INIT_WORK(&ctrl->ioerr_work, nvme_fc_ctrl_ioerr_work);
+	INIT_DELAYED_WORK(&ctrl->ioerr_work, nvme_fc_ctrl_ioerr_work);
 	spin_lock_init(&ctrl->lock);
 
 	/* io queue count */
@@ -3563,7 +3601,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 
 fail_ctrl:
 	nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING);
-	cancel_work_sync(&ctrl->ioerr_work);
+	cancel_delayed_work_sync(&ctrl->ioerr_work);
 	cancel_work_sync(&ctrl->ctrl.reset_work);
 	cancel_delayed_work_sync(&ctrl->connect_work);
 
-- 
2.51.2

From nobody Mon Dec 1 23:34:59 2025
From: Mohamed Khalfella
To: Chaitanya Kulkarni, Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg
Cc: Aaron Dailey, Randy Jennings, John Meneghini, Hannes Reinecke, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella
Subject: [RFC PATCH 14/14] nvme-fc: Hold inflight requests while in RECOVERING state
Date: Tue, 25 Nov 2025 18:12:01 -0800
Message-ID: <20251126021250.2583630-15-mkhalfella@purestorage.com>
X-Mailer: git-send-email 2.51.2
In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com>
References: <20251126021250.2583630-1-mkhalfella@purestorage.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

nvme_fc_delete_association(), called from the error recovery codepath,
waits for all requests to be completed. In RECOVERING state, inflight
IOs should be held until it is safe for them to be retried. Update
nvme_fc_fcpio_done() to not complete requests while in RECOVERING
state. Update the recovery codepath to cancel inflight requests,
similar to what nvme-tcp and nvme-rdma do today.

Signed-off-by: Mohamed Khalfella
---
 drivers/nvme/host/fc.c | 50 +++++++++++++++++++++++++++++++++---------
 1 file changed, 40 insertions(+), 10 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 0e4d271bb4b6..1b4f42358f37 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -171,7 +171,7 @@ struct nvme_fc_ctrl {
 
 	struct kref ref;
 	unsigned long flags;
-	u32 iocnt;
+	atomic_t iocnt;
 	wait_queue_head_t ioabort_wait;
 
 	struct nvme_fc_fcp_op aen_ops[NVME_NR_AEN_COMMANDS];
@@ -1816,7 +1816,7 @@ __nvme_fc_abort_op(struct nvme_fc_ctrl *ctrl, struct nvme_fc_fcp_op *op)
 		atomic_set(&op->state, opstate);
 	else if (test_bit(FCCTRL_TERMIO, &ctrl->flags)) {
 		op->flags |= FCOP_FLAGS_TERMIO;
-		ctrl->iocnt++;
+		atomic_inc(&ctrl->iocnt);
 	}
 	spin_unlock_irqrestore(&ctrl->lock, flags);
 
@@ -1846,20 +1846,29 @@ nvme_fc_abort_aen_ops(struct nvme_fc_ctrl *ctrl)
 }
 
 static inline void
+__nvme_fc_fcpop_count_one_down(struct nvme_fc_ctrl *ctrl)
+{
+	if (atomic_dec_return(&ctrl->iocnt) == 0)
+		wake_up(&ctrl->ioabort_wait);
+}
+
+static inline bool
 __nvme_fc_fcpop_chk_teardowns(struct nvme_fc_ctrl *ctrl,
 		struct nvme_fc_fcp_op *op, int opstate)
 {
 	unsigned long flags;
+	bool ret = false;
 
 	if (opstate == FCPOP_STATE_ABORTED) {
 		spin_lock_irqsave(&ctrl->lock, flags);
 		if (test_bit(FCCTRL_TERMIO, &ctrl->flags) &&
 		    op->flags & FCOP_FLAGS_TERMIO) {
-			if (!--ctrl->iocnt)
-				wake_up(&ctrl->ioabort_wait);
+			ret = true;
 		}
 		spin_unlock_irqrestore(&ctrl->lock, flags);
 	}
+
+	return ret;
 }
 
 static int nvme_fc_recover_ctrl(struct nvme_ctrl *ctrl)
@@ -1950,7 +1959,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
 	struct nvme_command *sqe = &op->cmd_iu.sqe;
 	__le16 status = cpu_to_le16(NVME_SC_SUCCESS << 1);
 	union nvme_result result;
-	bool terminate_assoc = true;
+	bool op_term, terminate_assoc = true;
 	int opstate;
 
 	/*
@@ -2083,17 +2092,34 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
 done:
 	if (op->flags & FCOP_FLAGS_AEN) {
 		nvme_complete_async_event(&queue->ctrl->ctrl, status, &result);
-		__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate);
+		if (__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate))
+			__nvme_fc_fcpop_count_one_down(ctrl);
 		atomic_set(&op->state, FCPOP_STATE_IDLE);
 		op->flags = FCOP_FLAGS_AEN; /* clear other flags */
 		nvme_fc_ctrl_put(ctrl);
 		goto check_error;
 	}
 
-	__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate);
+	/*
+	 * We can not access op after the request is completed because it can
+	 * be reused immediately. At the same time we want to wakeup the thread
+	 * waiting for ongoing IOs _after_ requests are completed. This is
+	 * necessary because that thread will start canceling inflight IOs
+	 * and we want to avoid request completion racing with cancellation.
+	 */
+	op_term = __nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate);
+
+	/* Error recovery completes inflight requests when it is safe */
+	if (nvme_ctrl_state(&ctrl->ctrl) == NVME_CTRL_RECOVERING)
+		goto check_op_term;
+
 	if (!nvme_try_complete_req(rq, status, result))
 		nvme_fc_complete_rq(rq);
 
+check_op_term:
+	if (op_term)
+		__nvme_fc_fcpop_count_one_down(ctrl);
+
 check_error:
 	if (terminate_assoc)
 		nvme_fc_start_ioerr_recovery(ctrl, "io error");
@@ -2737,7 +2763,8 @@ nvme_fc_start_fcp_op(struct nvme_fc_ctrl *ctrl, struct nvme_fc_queue *queue,
 		 * cmd with the csn was supposed to arrive.
 		 */
 		opstate = atomic_xchg(&op->state, FCPOP_STATE_COMPLETE);
-		__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate);
+		if (__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate))
+			__nvme_fc_fcpop_count_one_down(ctrl);
 
 		if (!(op->flags & FCOP_FLAGS_AEN)) {
 			nvme_fc_unmap_data(ctrl, op->rq, op);
@@ -3206,7 +3233,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
 
 	spin_lock_irqsave(&ctrl->lock, flags);
 	set_bit(FCCTRL_TERMIO, &ctrl->flags);
-	ctrl->iocnt = 0;
+	atomic_set(&ctrl->iocnt, 0);
 	spin_unlock_irqrestore(&ctrl->lock, flags);
 
 	__nvme_fc_abort_outstanding_ios(ctrl, false);
@@ -3215,8 +3242,8 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
 	nvme_fc_abort_aen_ops(ctrl);
 
 	/* wait for all io that had to be aborted */
+	wait_event(ctrl->ioabort_wait, atomic_read(&ctrl->iocnt) == 0);
 	spin_lock_irq(&ctrl->lock);
-	wait_event_lock_irq(ctrl->ioabort_wait, ctrl->iocnt == 0, ctrl->lock);
 	clear_bit(FCCTRL_TERMIO, &ctrl->flags);
 	spin_unlock_irq(&ctrl->lock);
 
@@ -3362,6 +3389,9 @@ nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl)
 	/* will block while waiting for io to terminate */
 	nvme_fc_delete_association(ctrl);
 
+	nvme_cancel_tagset(&ctrl->ctrl);
+	nvme_cancel_admin_tagset(&ctrl->ctrl);
+
 	/* Do not reconnect if controller is being deleted */
 	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))
 		return;
-- 
2.51.2