From: Cen Zhang
To: Keith Busch, Christoph Hellwig, Sagi Grimberg, Jens Axboe, Steve Wise
Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
    baijiaju1990@gmail.com, zzzccc427@gmail.com
Subject: [PATCH v4] nvme-rdma: hold nvme_rdma_device reference in remove_one
Date: Sun, 5 Jul 2026 00:27:57 +0800
Message-Id: <20260704162757.1792325-1-zzzccc427@gmail.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: linux-kernel@vger.kernel.org

nvme_rdma_remove_one() first verifies that an ib_device has an
nvme_rdma_device on device_list, but it drops device_list_mutex before
it walks nvme_rdma_ctrl_list. It then identifies matching controllers by
dereferencing ctrl->device->dev while holding only nvme_rdma_ctrl_mutex.

ctrl->device is a cached copy of queue 0's struct nvme_rdma_device.
Queue teardown owns that object's kref, so controller list membership
does not keep the nvme_rdma_device alive.

The buggy scenario involves two paths, with each column showing the
order within that path:

RDMA remove callback:             Controller error recovery:
1. find ndev on device_list       1. run nvme_rdma_error_recovery_work()
2. drop device_list_mutex         2. tear down the admin queue
3. walk nvme_rdma_ctrl_list       3. drop the final queue device ref
4. read ctrl->device->dev         4. free the nvme_rdma_device

Fix this by taking a temporary reference to the matching struct
nvme_rdma_device while still holding device_list_mutex. The controller
walk can then compare ctrl->device directly with the pinned
nvme_rdma_device without dereferencing a queue-owned object that might
have been freed.
Release the temporary reference after the delete workqueue flush.

Validation reproduced this kernel report:

BUG: KASAN: slab-use-after-free in nvme_rdma_remove_one+0x281/0x2c0 [nvme_rdma]
Call Trace:
 dump_stack_lvl+0x66/0xa0
 print_report+0xce/0x630
 ? nvme_rdma_remove_one+0x281/0x2c0 [nvme_rdma]
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? __virt_addr_valid+0x20d/0x410
 ? nvme_rdma_remove_one+0x281/0x2c0 [nvme_rdma]
 kasan_report+0xe0/0x110
 ? nvme_rdma_remove_one+0x281/0x2c0 [nvme_rdma]
 nvme_rdma_remove_one+0x281/0x2c0 [nvme_rdma]
 remove_client_context+0xa9/0xf0 [ib_core]
 disable_device+0x12d/0x240 [ib_core]
 ? __pfx_disable_device+0x10/0x10 [ib_core]
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? __mutex_unlock_slowpath+0x147/0x900
 __ib_unregister_device+0x26f/0x460 [ib_core]
 ib_unregister_device_and_put+0x55/0x70 [ib_core]
 nldev_dellink+0x29e/0x3c0 [ib_core]
 ? unwind_next_frame+0x6e3/0x2190
 ? __pfx_nldev_dellink+0x10/0x10 [ib_core]
 ? lock_acquire+0x2b8/0x2f0
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? cap_capable+0x196/0x330
 ? __pfx_down_read+0x10/0x10
 rdma_nl_rcv_msg+0x2db/0x5f0 [ib_core]
 ? __pfx_rdma_nl_rcv_msg+0x10/0x10 [ib_core]
 rdma_nl_rcv_skb.constprop.0.isra.0+0x222/0x380 [ib_core]
 ? __pfx_rdma_nl_rcv_skb.constprop.0.isra.0+0x10/0x10 [ib_core]
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? netlink_deliver_tap+0x150/0xac0
 netlink_unicast+0x47c/0x790
 ? __pfx_netlink_unicast+0x10/0x10
 netlink_sendmsg+0x767/0xc30
 ? __pfx_netlink_sendmsg+0x10/0x10
 ? lock_release+0x1e0/0x280
 __sys_sendto+0x339/0x390
 ? __pfx___sys_sendto+0x10/0x10
 ? srso_alias_return_thunk+0x5/0xfbef5
 __x64_sys_sendto+0xe0/0x1c0
 ? do_syscall_64+0x81/0x6a0
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? trace_hardirqs_on+0x18/0x160
 do_syscall_64+0x115/0x6a0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Allocated by task 436:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 nvme_rdma_cm_handler+0xcbc/0x2914 [nvme_rdma]
 cma_cm_event_handler+0xb2/0x390 [rdma_cm]
 addr_handler+0x199/0x2b0 [rdma_cm]
 process_one_req+0x113/0x650 [ib_core]
 process_one_work+0x8d0/0x1870
 worker_thread+0x575/0xf80
 kthread+0x2e7/0x3c0
 ret_from_fork+0x576/0x810
 ret_from_fork_asm+0x1a/0x30

Freed by task 436:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x5f/0x80
 kfree+0x307/0x580
 nvme_rdma_free_dev+0x16d/0x260 [nvme_rdma]
 nvme_rdma_free_queue+0x6d/0x90 [nvme_rdma]
 nvme_rdma_error_recovery_work+0x7f/0x110 [nvme_rdma]
 process_one_work+0x8d0/0x1870
 worker_thread+0x575/0xf80
 kthread+0x2e7/0x3c0
 ret_from_fork+0x576/0x810
 ret_from_fork_asm+0x1a/0x30

Fixes: e87a911fed07 ("nvme-rdma: use ib_client API to detect device removal")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Cen Zhang
---
v4:
  Renamed the subject and commit message wording to refer explicitly to
  struct nvme_rdma_device instead of the vague "device wrapper" term.

v3:
  Removed the temporary iterator and goto-based lookup in
  nvme_rdma_remove_one(). Preserved the original found-based device-list
  flow while pinning the matched nvme_rdma_device.

v2:
  Reworked the fix to take a temporary nvme_rdma_device reference during
  the device_list lookup instead of adding a cached ib_device field to
  struct nvme_rdma_ctrl. Changed the controller-list match to compare
  ctrl->device against the pinned wrapper while preserving the existing
  delete workqueue flush behavior.
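
Note (not part of the patch): the lifetime rule the fix relies on can be
sketched in userspace. This is a minimal illustration under stated
assumptions, not the driver's code. Every demo_* name is hypothetical; a
pthread mutex stands in for device_list_mutex and a plain counter
guarded by that mutex stands in for the kref. It only shows that a
reference taken during the lookup, before the list lock is dropped,
keeps the object valid after the other owner drops its reference.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nvme_rdma_device. */
struct demo_dev {
	int refcount;		/* protected by demo_list_mutex */
	const void *ib_device;	/* opaque lookup key */
	struct demo_dev *next;
};

static pthread_mutex_t demo_list_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct demo_dev *demo_list;
static int demo_frees;		/* lets a caller observe when the object dies */

/* Allocate a device, list it, and return it holding one reference. */
static struct demo_dev *demo_dev_create(const void *ib_device)
{
	struct demo_dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;
	d->ib_device = ib_device;

	pthread_mutex_lock(&demo_list_mutex);
	d->next = demo_list;
	demo_list = d;
	pthread_mutex_unlock(&demo_list_mutex);
	return d;
}

/* Drop one reference; the last put unlinks the object and frees it. */
static void demo_dev_put(struct demo_dev *d)
{
	bool last;

	pthread_mutex_lock(&demo_list_mutex);
	assert(d->refcount > 0);
	last = (--d->refcount == 0);
	if (last) {
		struct demo_dev **pp = &demo_list;

		while (*pp != d)
			pp = &(*pp)->next;
		*pp = d->next;
	}
	pthread_mutex_unlock(&demo_list_mutex);

	if (last) {
		demo_frees++;
		free(d);
	}
}

/*
 * Look up a device by key and pin it before the list lock is dropped.
 * A listed entry always has refcount > 0 here, because the last put
 * unlinks it under the same lock.
 */
static struct demo_dev *demo_dev_find_and_pin(const void *ib_device)
{
	struct demo_dev *d;

	pthread_mutex_lock(&demo_list_mutex);
	for (d = demo_list; d; d = d->next) {
		if (d->ib_device == ib_device) {
			d->refcount++;
			break;
		}
	}
	pthread_mutex_unlock(&demo_list_mutex);
	return d;	/* NULL when no entry matched */
}
```

A caller would pin the entry with demo_dev_find_and_pin(), let the
"queue" owner call demo_dev_put(), and still be able to compare or read
the pinned pointer until its own demo_dev_put(). Without the pin, the
queue owner's put would be the last one and any later access would be
the use-after-free shown in the KASAN report.
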
 drivers/nvme/host/rdma.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 6909e3542794..663e38540073 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2378,7 +2378,7 @@ static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 	mutex_lock(&device_list_mutex);
 	list_for_each_entry(ndev, &device_list, entry) {
 		if (ndev->dev == ib_device) {
-			found = true;
+			found = nvme_rdma_dev_get(ndev);
 			break;
 		}
 	}
@@ -2390,13 +2390,14 @@ static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 	/* Delete all controllers using this device */
 	mutex_lock(&nvme_rdma_ctrl_mutex);
 	list_for_each_entry(ctrl, &nvme_rdma_ctrl_list, list) {
-		if (ctrl->device->dev != ib_device)
+		if (ctrl->device != ndev)
 			continue;
 		nvme_delete_ctrl(&ctrl->ctrl);
 	}
 	mutex_unlock(&nvme_rdma_ctrl_mutex);
 
 	flush_workqueue(nvme_delete_wq);
+	nvme_rdma_dev_put(ndev);
 }
 
 static struct ib_client nvme_rdma_ib_client = {
-- 
2.43.0