From nobody Tue Dec 2 00:05:28 2025 Received: from mail-pf1-f181.google.com (mail-pf1-f181.google.com [209.85.210.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 32631311940 for ; Wed, 26 Nov 2025 02:13:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123216; cv=none; b=uHJD56Jms9bq4c/FWNBi22anlOIdJa6znY329VFgfCeJtuGpDNAmmfX1v88mZxhOCAn95JC/7GXgpBrjJ8sTGKMHr6AYumWV94BEA1D/ucyBaigMDwvgN2TdtWTR8B0y3Q3bEvQiWeiNZYjhwdkt2B+zGXtGaYsiEChvWaBBHsQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764123216; c=relaxed/simple; bh=UWxAkf7ceBYfAJ1nghx7+50+RPhjRFNdr9Yo5RVZ4Dk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Mu8eGsMZYmPtVsVrvFvZ0Sa3FUmePQtBkTiWKCKefDtOjc4kBrlZZjYPEApqXnVMHdz5jKP29cgSCpOpKY0l5dGpciIAwPbgWPEd9pPgewTDFnsCD87aaDRjza76ZYaYt9kB+lLXrHBRYBEGpZWizUgn+kOaVv7V2tpENisU6qM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=U3fU/xiL; arc=none smtp.client-ip=209.85.210.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="U3fU/xiL" Received: by mail-pf1-f181.google.com with SMTP id d2e1a72fcca58-7b9c17dd591so5389858b3a.3 for ; Tue, 25 Nov 2025 18:13:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purestorage.com; s=google2022; t=1764123214; x=1764728014; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=FbK5fZiutYhedIPWYxEBAuO5w3UfHl/USBKz0oqynIM=; b=U3fU/xiLYDq1KuieuCVdpYFvM7yYYeccI+asbRllHXSmxOKqLqNeiGkH66U4SYTjzd FUbhFI2AcbbYW5zMFXTTqfSaKGnQ4LSSlao861SjAARvKJAhF5vRjh7KmErqV8lcVuu9 boOsFGs4pMU3IVHTEH7BR/rzCd/wcRXNeRaAvrb7wQOp3PvIzxEk+Aj6yz09qejCtoN9 LN96Ad6hkzJDw/btboKRkhKkcSmIg+2L+jge2d5IesqcChQ8VeI2V/QCP8HCBXiCBMKs KO8fFRoRsqtRlbo15MrBNFANc0hPMJNYEjmchci1UPaFpTJhDsr/G9Ust/qDC2Gc5AES 9fwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764123214; x=1764728014; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=FbK5fZiutYhedIPWYxEBAuO5w3UfHl/USBKz0oqynIM=; b=uFho8F6eJZROH2N+UJWMKwEXxBt1aAITsO8/oM57Dwhj5rmZ3Sr5vPefiY/eKP4XFo wGgjHrXWAFKMNNzKfBmHTO41lF/DMtqUYBMOF3ZfhvoIyd/SjwUyPRN5uvG7dr4Ob3ly AtXJ/HC7EvtuPxPj2L5XkNRneJE8SwOrP8ms5V0lutoRkeZJmbg+WR9KRziZV9PqvMDC k2UkBAeXgthvggyxqn2YhU6RN0Kel3HHHF3TeHyU2fNZDjU3zGhGDWH4+voYkjO+/2yV Z3rDERrHusJ7juvjTe7y3SQ5KVtQOKSvf4Sj7qTS8UHtejV4Hi04HtT/6onppV1655RB insw== X-Forwarded-Encrypted: i=1; AJvYcCVLv8rTBA5oFfAV1Lsp9VM1RAzpooUgFiQus9DJZ1c1260PVIJiR8tljAfylrSntrUJubnxm6wcuBF0Ph0=@vger.kernel.org X-Gm-Message-State: AOJu0YzWSHhFzXXbe/1MH0NQuj13XnKaJMxQJKOuWE63+ZwMBdyFEEb1 ZUc2g7VBoQFvxBA3sBtX6jzhP9H4Zkqx1cjrFoE2ZP5EiytjygKehtUbb7H8bA2FPqU= X-Gm-Gg: ASbGncttb7kRhh5gZkHcDlrBc7FhLkdpBQyA09hKL41utbMVxF/Iix8TU4QP1ntc65X 1ri5H5c6zuKIK6a/+yUc9g5Aq1JWiy/xO3YBqvxaqGCx5u9GNWEUydzTFASO82+Xa+o0n6AMRxS Za/m+P7aLU76eG60VOenqqYQDWLxpl73p/ahNQO4j3XAwrbv1DRr5u/zxHMhERQ87cffYdp95Q8 KEqaON4s4hb5D56TCnuv7+GN4IyYDjoQKAEPKb3NGkGscMZQ/TBAAm7pczw/I2ACqDyPrSHPSKb xpr/nZktmIdzDuvdp2U/q4ZZn2Z1d0+a4RbW8i5XdDBvWHylIVBGIYLJyr1imSIjIAJGizP6fxa 1eLqOFtS0VEqqDE0oHaSoqfp5VsV2tBKSxqPof0maVSvXBav9ewGfCXGfYiCDESaehNeXEmJtAR 2gRCN7e1emhPuj8s6anqrUdpq/tQnTADNlvA== X-Google-Smtp-Source: AGHT+IERM1coBAz6VHaGSPU/mKrv39LOQAC7fg3IEBFYYqWZMiyIxbaDOZPcp2pxCE0bN+fnmO+nJw== X-Received: by 2002:a05:7022:797:b0:11b:ceee:a46d with SMTP id a92af1059eb24-11cb3edd0fbmr3397464c88.15.1764123214127; Tue, 25 Nov 2025 18:13:34 -0800 (PST) Received: from apollo.purestorage.com ([208.88.152.253]) by smtp.googlemail.com with ESMTPSA id a92af1059eb24-11cc631c236sm17922979c88.7.2025.11.25.18.13.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 18:13:33 -0800 (PST) From: Mohamed Khalfella To: Chaitanya Kulkarni , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg Cc: Aaron Dailey , Randy Jennings , John Meneghini , Hannes Reinecke , linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Mohamed Khalfella Subject: [RFC PATCH 14/14] nvme-fc: Hold inflight requests while in RECOVERING state Date: Tue, 25 Nov 2025 18:12:01 -0800 Message-ID: <20251126021250.2583630-15-mkhalfella@purestorage.com> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20251126021250.2583630-1-mkhalfella@purestorage.com> References: <20251126021250.2583630-1-mkhalfella@purestorage.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" nvme_fc_delete_association() called from error recovery codepath waits for all requests to be completed. In RECOVERING state inflight IOs should be held until it is safe to for them to be retried. Update nvme_fc_fcpio_done() to not complete requests while in RECOVERING state. Update recovery codepath to cancel inflight requests similar to what nvme-tcp and nvme-rdma do today. Signed-off-by: Mohamed Khalfella --- drivers/nvme/host/fc.c | 50 +++++++++++++++++++++++++++++++++--------- 1 file changed, 40 insertions(+), 10 deletions(-) diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index 0e4d271bb4b6..1b4f42358f37 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -171,7 +171,7 @@ struct nvme_fc_ctrl { =20 struct kref ref; unsigned long flags; - u32 iocnt; + atomic_t iocnt; wait_queue_head_t ioabort_wait; =20 struct nvme_fc_fcp_op aen_ops[NVME_NR_AEN_COMMANDS]; @@ -1816,7 +1816,7 @@ __nvme_fc_abort_op(struct nvme_fc_ctrl *ctrl, struct = nvme_fc_fcp_op *op) atomic_set(&op->state, opstate); else if (test_bit(FCCTRL_TERMIO, &ctrl->flags)) { op->flags |=3D FCOP_FLAGS_TERMIO; - ctrl->iocnt++; + atomic_inc(&ctrl->iocnt); } spin_unlock_irqrestore(&ctrl->lock, flags); =20 @@ -1846,20 +1846,29 @@ nvme_fc_abort_aen_ops(struct nvme_fc_ctrl *ctrl) } =20 static inline void +__nvme_fc_fcpop_count_one_down(struct nvme_fc_ctrl *ctrl) +{ + if (atomic_dec_return(&ctrl->iocnt) =3D=3D 0) + wake_up(&ctrl->ioabort_wait); +} + +static inline bool __nvme_fc_fcpop_chk_teardowns(struct nvme_fc_ctrl *ctrl, struct nvme_fc_fcp_op *op, int opstate) { unsigned long flags; + bool ret =3D false; =20 if (opstate =3D=3D FCPOP_STATE_ABORTED) { spin_lock_irqsave(&ctrl->lock, flags); if (test_bit(FCCTRL_TERMIO, &ctrl->flags) && op->flags & FCOP_FLAGS_TERMIO) { - if (!--ctrl->iocnt) - wake_up(&ctrl->ioabort_wait); + ret =3D true; } spin_unlock_irqrestore(&ctrl->lock, flags); } + + return ret; } =20 static int nvme_fc_recover_ctrl(struct nvme_ctrl *ctrl) @@ -1950,7 +1959,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req) struct nvme_command *sqe =3D &op->cmd_iu.sqe; __le16 status =3D cpu_to_le16(NVME_SC_SUCCESS << 1); union nvme_result result; - bool terminate_assoc =3D true; + bool op_term, terminate_assoc =3D true; int opstate; =20 /* @@ -2083,17 +2092,34 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req) done: if (op->flags & FCOP_FLAGS_AEN) { nvme_complete_async_event(&queue->ctrl->ctrl, status, &result); - __nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate); + if (__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate)) + __nvme_fc_fcpop_count_one_down(ctrl); atomic_set(&op->state, FCPOP_STATE_IDLE); op->flags =3D FCOP_FLAGS_AEN; /* clear other flags */ nvme_fc_ctrl_put(ctrl); goto check_error; } =20 - __nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate); + /* + * We can not access op after the request is completed because it can + * be reused immediately. At the same time we want to wakeup the thread + * waiting for ongoing IOs _after_ requests are completed. This is + * necessary because that thread will start canceling inflight IOs + * and we want to avoid request completion racing with cancellation. + */ + op_term =3D __nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate); + + /* Error recovery completes inflight reqeusts when it is safe */ + if (nvme_ctrl_state(&ctrl->ctrl) =3D=3D NVME_CTRL_RECOVERING) + goto check_op_term; + if (!nvme_try_complete_req(rq, status, result)) nvme_fc_complete_rq(rq); =20 +check_op_term: + if (op_term) + __nvme_fc_fcpop_count_one_down(ctrl); + check_error: if (terminate_assoc) nvme_fc_start_ioerr_recovery(ctrl, "io error"); @@ -2737,7 +2763,8 @@ nvme_fc_start_fcp_op(struct nvme_fc_ctrl *ctrl, struc= t nvme_fc_queue *queue, * cmd with the csn was supposed to arrive. */ opstate =3D atomic_xchg(&op->state, FCPOP_STATE_COMPLETE); - __nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate); + if (__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate)) + __nvme_fc_fcpop_count_one_down(ctrl); =20 if (!(op->flags & FCOP_FLAGS_AEN)) { nvme_fc_unmap_data(ctrl, op->rq, op); @@ -3206,7 +3233,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl) =20 spin_lock_irqsave(&ctrl->lock, flags); set_bit(FCCTRL_TERMIO, &ctrl->flags); - ctrl->iocnt =3D 0; + atomic_set(&ctrl->iocnt, 0); spin_unlock_irqrestore(&ctrl->lock, flags); =20 __nvme_fc_abort_outstanding_ios(ctrl, false); @@ -3215,8 +3242,8 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl) nvme_fc_abort_aen_ops(ctrl); =20 /* wait for all io that had to be aborted */ + wait_event(ctrl->ioabort_wait, atomic_read(&ctrl->iocnt) =3D=3D 0); spin_lock_irq(&ctrl->lock); - wait_event_lock_irq(ctrl->ioabort_wait, ctrl->iocnt =3D=3D 0, ctrl->lock); clear_bit(FCCTRL_TERMIO, &ctrl->flags); spin_unlock_irq(&ctrl->lock); =20 @@ -3362,6 +3389,9 @@ nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl) /* will block while waiting for io to terminate */ nvme_fc_delete_association(ctrl); =20 + nvme_cancel_tagset(&ctrl->ctrl); + nvme_cancel_admin_tagset(&ctrl->ctrl); + /* Do not reconnect if controller is being deleted */ if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) return; --=20 2.51.2