From: Alex Tran
Date: Fri, 26 Dec 2025 09:54:19 -0800
Subject: [PATCH v2] nvme/host: add delayed retries upon non-fatal error during ns validation
Message-Id: <20251226-nvme_ns_validation-v2-1-0e1b3a142553@gmail.com>
To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, Alex Tran

If a non-fatal error is received during NVMe namespace validation, it
should not be ignored, and the namespace should not be removed
immediately. Instead, namespace validation should be retried after a
delay. This handles non-fatal issues more robustly by retrying a few
times before giving up.

The number of retries is set to 3 and the interval between retries is
set to 3 seconds.

The retries are handled in a local loop in nvme_validate_ns():
- On success, stop.
- On a fatal error, remove the namespace and stop.
- On a non-fatal error, retry until the maximum retry count is reached,
  then log an error and stop without removing the namespace.

Signed-off-by: Alex Tran
---
Changes in v2:
- Simplify retry logic with local loop instead of delayed work.
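
For readers who want to trace the control flow outside the kernel, the
retry loop described above can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not the driver code: every
sketch_/SKETCH_ name is hypothetical, the status values are stand-ins, and
the sleep between attempts is left out. As in the diff, running out of
retries returns without removing the namespace.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel constants; values are illustrative. */
#define SKETCH_STATUS_DNR	0x4000	/* assumed "do not retry" bit */
#define SKETCH_RETRYABLE	0x0006	/* assumed positive status without DNR */
#define SKETCH_MAX_RETRIES	3	/* mirrors NVME_NS_VALIDATION_MAX_RETRIES */

enum sketch_outcome {
	SKETCH_OK,	/* update succeeded */
	SKETCH_REMOVE,	/* fatal error: caller would remove the namespace */
	SKETCH_GAVE_UP,	/* retries exhausted: namespace is left in place */
};

/*
 * Call update() once, then up to SKETCH_MAX_RETRIES more times while it
 * keeps returning a non-fatal error. *attempts counts the calls made.
 */
enum sketch_outcome sketch_validate(int (*update)(void *ctx), void *ctx,
				    unsigned int *attempts)
{
	unsigned int i;

	assert(update != NULL && attempts != NULL);
	*attempts = 0;

	for (i = 0; i <= SKETCH_MAX_RETRIES; i++) {
		int ret = update(ctx);

		(*attempts)++;
		if (ret == 0)
			return SKETCH_OK;
		if (ret > 0 && (ret & SKETCH_STATUS_DNR))
			return SKETCH_REMOVE;
		/* The patch sleeps here (msleep) when another attempt remains. */
	}
	return SKETCH_GAVE_UP;
}

/* Example callback: fails with a retryable status *ctx times, then succeeds. */
int sketch_flaky_update(void *ctx)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return SKETCH_RETRYABLE;
	}
	return 0;
}

/* Example callback: always reports a fatal (DNR) status. */
int sketch_fatal_update(void *ctx)
{
	(void)ctx;
	return SKETCH_RETRYABLE | SKETCH_STATUS_DNR;
}
```

With a callback that fails twice and then succeeds, sketch_validate()
reports SKETCH_OK after three calls. A callback that never succeeds is
called four times: one initial attempt plus three retries.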
---
 drivers/nvme/host/core.c | 37 +++++++++++++++++++++++++++++--------
 drivers/nvme/host/nvme.h |  6 ++++++
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7bf228df6001f1f4d0b3c570de285a5eb17bb08e..6cd1ce4a60af55e5673e5fd0ec2053a028fae4f2 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4289,23 +4289,44 @@ static void nvme_ns_remove_by_nsid(struct nvme_ctrl *ctrl, u32 nsid)
 static void nvme_validate_ns(struct nvme_ns *ns, struct nvme_ns_info *info)
 {
 	int ret = NVME_SC_INVALID_NS | NVME_STATUS_DNR;
+	unsigned int i;
 
 	if (!nvme_ns_ids_equal(&ns->head->ids, &info->ids)) {
-		dev_err(ns->ctrl->device,
-			"identifiers changed for nsid %d\n", ns->head->ns_id);
+		dev_err(ns->ctrl->device, "identifiers changed for nsid %d\n",
+			ns->head->ns_id);
 		goto out;
 	}
 
-	ret = nvme_update_ns_info(ns, info);
+	for (i = 0; i <= NVME_NS_VALIDATION_MAX_RETRIES; i++) {
+		ret = nvme_update_ns_info(ns, info);
+		if (ret == 0)
+			return;
+
+		if (ret > 0 && (ret & NVME_STATUS_DNR))
+			goto out;
+
+		if (i == NVME_NS_VALIDATION_MAX_RETRIES) {
+			dev_err(ns->ctrl->device,
+				"validation failed for nsid %d after %d retries\n",
+				ns->head->ns_id,
+				NVME_NS_VALIDATION_MAX_RETRIES);
+			return;
+		}
+
+		dev_warn(ns->ctrl->device,
+			 "validation failed for nsid %d, retry %d/%d in %dms\n",
+			 ns->head->ns_id, i + 1, NVME_NS_VALIDATION_MAX_RETRIES,
+			 NVME_NS_VALIDATION_RETRY_INTERVAL);
+
+		msleep(NVME_NS_VALIDATION_RETRY_INTERVAL);
+	}
+
 out:
 	/*
 	 * Only remove the namespace if we got a fatal error back from the
-	 * device, otherwise ignore the error and just move on.
-	 *
-	 * TODO: we should probably schedule a delayed retry here.
+	 * device.
 	 */
-	if (ret > 0 && (ret & NVME_STATUS_DNR))
-		nvme_ns_remove(ns);
+	nvme_ns_remove(ns);
 }
 
 static void nvme_scan_ns(struct nvme_ctrl *ctrl, unsigned nsid)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9a5f28c5103c5c42777bd9309a983ef0196c1b95..dcbdc7fa8af0cb838b3f1a774d4c67fa69a00050 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -46,6 +46,12 @@ extern unsigned int admin_timeout;
 #define NVME_CTRL_PAGE_SHIFT	12
 #define NVME_CTRL_PAGE_SIZE	(1 << NVME_CTRL_PAGE_SHIFT)
 
+/*
+ * Default to 3 retries in intervals of 3000ms for namespace validation
+ */
+#define NVME_NS_VALIDATION_MAX_RETRIES		3
+#define NVME_NS_VALIDATION_RETRY_INTERVAL	3000
+
 extern struct workqueue_struct *nvme_wq;
 extern struct workqueue_struct *nvme_reset_wq;
 extern struct workqueue_struct *nvme_delete_wq;

---
base-commit: fa084c35afa13ab07a860ef0936cd987f9aa0460
change-id: 20251225-nvme_ns_validation-310a07aa8b2b

Best regards,
-- 
Alex Tran