From: Daniel Wagner
To: linux-nvme@lists.infradead.org
Cc: linux-kernel@vger.kernel.org, Hannes Reinecke, Sagi Grimberg,
	Jason Gunthorpe, James Smart, Chaitanya Kulkarni,
	Christoph Hellwig, Daniel Wagner
Subject: [RFC v1 3/4] nvmet-fc: untangle cross refcounting objects
Date: Tue, 29 Aug 2023 11:13:48 +0200
Message-ID: <20230829091350.16156-4-dwagner@suse.de>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230829091350.16156-1-dwagner@suse.de>
References: <20230829091350.16156-1-dwagner@suse.de>

Associations take a refcount on queues, and queues take a refcount on
associations.
The existing code leads to a situation where the target executes a
disconnect and the host immediately retriggers a reconnect. The
reconnect command still finds an existing association and uses it, but
the reconnect crashes later on because nvmet_fc_delete_target_assoc()
blindly goes ahead and removes resources while the reconnect code still
wants to use them. The underlying problem is that
nvmet_fc_find_target_assoc() is able to look up an association which is
being removed.

So the first thing, to address nvmet_fc_find_target_queue(), is to
remove the association from the list, wait an RCU grace period, and
free the resources in the release callback of the kref_put(). The
lifetime of the queues is strictly bound to the lifetime of the
association, thus we don't need to take reverse refcounts
(queue -> association).

Furthermore, streamline the cleanup code by using the workqueue to
delete the association in nvmet_fc_ls_disconnect. This ensures that we
run through the same shutdown path in all non-error cases.

Reproducer: nvme/003

Signed-off-by: Daniel Wagner
---
 drivers/nvme/target/fc.c | 67 ++++++++++++++++++++--------------------
 1 file changed, 33 insertions(+), 34 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index df7d84aff843..9d7262a8e3db 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -165,6 +165,7 @@ struct nvmet_fc_tgt_assoc {
 	struct nvmet_fc_hostport	*hostport;
 	struct nvmet_fc_ls_iod		*rcv_disconn;
 	struct list_head		a_list;
+	struct nvmet_fc_tgt_queue	*_queues[NVMET_NR_QUEUES + 1];
 	struct nvmet_fc_tgt_queue __rcu	*queues[NVMET_NR_QUEUES + 1];
 	struct kref			ref;
 	struct work_struct		del_work;
@@ -802,14 +803,11 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
 	if (!queue)
 		return NULL;
 
-	if (!nvmet_fc_tgt_a_get(assoc))
-		goto out_free_queue;
-
 	queue->work_q = alloc_workqueue("ntfc%d.%d.%d", 0, 0,
 				assoc->tgtport->fc_target_port.port_num,
 				assoc->a_id, qid);
 	if (!queue->work_q)
-		goto out_a_put;
+		goto out_free_queue;
 
 	queue->qid = qid;
 	queue->sqsize = sqsize;
@@ -830,7 +828,8 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
 	if (ret)
 		goto out_fail_iodlist;
 
-	WARN_ON(assoc->queues[qid]);
+	WARN_ON(assoc->_queues[qid]);
+	assoc->_queues[qid] = queue;
 	rcu_assign_pointer(assoc->queues[qid], queue);
 
 	return queue;
@@ -838,8 +837,6 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
 out_fail_iodlist:
 	nvmet_fc_destroy_fcp_iodlist(assoc->tgtport, queue);
 	destroy_workqueue(queue->work_q);
-out_a_put:
-	nvmet_fc_tgt_a_put(assoc);
 out_free_queue:
 	kfree(queue);
 	return NULL;
@@ -852,12 +849,8 @@ nvmet_fc_tgt_queue_free(struct kref *ref)
 	struct nvmet_fc_tgt_queue *queue =
 		container_of(ref, struct nvmet_fc_tgt_queue, ref);
 
-	rcu_assign_pointer(queue->assoc->queues[queue->qid], NULL);
-
 	nvmet_fc_destroy_fcp_iodlist(queue->assoc->tgtport, queue);
 
-	nvmet_fc_tgt_a_put(queue->assoc);
-
 	destroy_workqueue(queue->work_q);
 
 	kfree_rcu(queue, rcu);
@@ -1100,6 +1093,11 @@ nvmet_fc_delete_assoc(struct work_struct *work)
 		container_of(work, struct nvmet_fc_tgt_assoc, del_work);
 
 	nvmet_fc_delete_target_assoc(assoc);
+
+	/* release get taken in nvmet_fc_find_target_assoc */
+	nvmet_fc_tgt_a_put(assoc);
+
+	/* final reference from nvmet_fc_ls_create_association */
 	nvmet_fc_tgt_a_put(assoc);
 }
 
@@ -1172,13 +1170,18 @@ nvmet_fc_target_assoc_free(struct kref *ref)
 	struct nvmet_fc_tgtport *tgtport = assoc->tgtport;
 	struct nvmet_fc_ls_iod	*oldls;
 	unsigned long flags;
+	int i;
+
+	for (i = NVMET_NR_QUEUES; i >= 0; i--) {
+		if (assoc->_queues[i])
+			nvmet_fc_delete_target_queue(assoc->_queues[i]);
+	}
 
 	/* Send Disconnect now that all i/o has completed */
 	nvmet_fc_xmt_disconnect_assoc(assoc);
 
 	nvmet_fc_free_hostport(assoc->hostport);
 	spin_lock_irqsave(&tgtport->lock, flags);
-	list_del_rcu(&assoc->a_list);
 	oldls = assoc->rcv_disconn;
 	spin_unlock_irqrestore(&tgtport->lock, flags);
 	/* if pending Rcv Disconnect Association LS, send rsp now */
@@ -1208,7 +1211,6 @@ static void
 nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc)
 {
 	struct nvmet_fc_tgtport *tgtport = assoc->tgtport;
-	struct nvmet_fc_tgt_queue *queue;
 	int i, terminating;
 
 	terminating = atomic_xchg(&assoc->terminating, 1);
@@ -1217,29 +1219,21 @@ nvmet_fc_delete_target_assoc(struct nvmet_fc_tgt_assoc *assoc)
 	if (terminating)
 		return;
 
+	/* prevent new I/Os entering the queues */
+	for (i = NVMET_NR_QUEUES; i >= 0; i--)
+		rcu_assign_pointer(assoc->queues[i], NULL);
+	list_del_rcu(&assoc->a_list);
+	synchronize_rcu();
 
+	/* ensure all in-flight I/Os have been processed */
 	for (i = NVMET_NR_QUEUES; i >= 0; i--) {
-		rcu_read_lock();
-		queue = rcu_dereference(assoc->queues[i]);
-		if (!queue) {
-			rcu_read_unlock();
-			continue;
-		}
-
-		if (!nvmet_fc_tgt_q_get(queue)) {
-			rcu_read_unlock();
-			continue;
-		}
-		rcu_read_unlock();
-		nvmet_fc_delete_target_queue(queue);
-		nvmet_fc_tgt_q_put(queue);
+		if (assoc->_queues[i])
+			flush_workqueue(assoc->_queues[i]->work_q);
 	}
 
 	dev_info(tgtport->dev,
 		"{%d:%d} Association deleted\n",
 		tgtport->fc_target_port.port_num, assoc->a_id);
-
-	nvmet_fc_tgt_a_put(assoc);
 }
 
 static struct nvmet_fc_tgt_assoc *
@@ -1497,6 +1491,8 @@ __nvmet_fc_free_assocs(struct nvmet_fc_tgtport *tgtport)
 		nvmet_fc_tgt_a_put(assoc);
 	}
 	rcu_read_unlock();
+
+	flush_workqueue(nvmet_wq);
 }
 
 /**
@@ -1870,9 +1866,6 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport,
 			sizeof(struct fcnvme_ls_disconnect_assoc_acc)),
 			FCNVME_LS_DISCONNECT_ASSOC);
 
-	/* release get taken in nvmet_fc_find_target_assoc */
-	nvmet_fc_tgt_a_put(assoc);
-
 	/*
 	 * The rules for LS response says the response cannot
 	 * go back until ABTS's have been sent for all outstanding
@@ -1887,8 +1880,6 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport,
 	assoc->rcv_disconn = iod;
 	spin_unlock_irqrestore(&tgtport->lock, flags);
 
-	nvmet_fc_delete_target_assoc(assoc);
-
 	if (oldls) {
 		dev_info(tgtport->dev,
 			"{%d:%d} Multiple Disconnect Association LS's "
@@ -1904,6 +1895,11 @@ nvmet_fc_ls_disconnect(struct nvmet_fc_tgtport *tgtport,
 		nvmet_fc_xmt_ls_rsp(tgtport, oldls);
 	}
 
+	if (!queue_work(nvmet_wq, &assoc->del_work)) {
+		/* already deleting - release local reference */
+		nvmet_fc_tgt_a_put(assoc);
+	}
+
 	return false;
 }
 
@@ -2933,6 +2929,9 @@ static int __init nvmet_fc_init_module(void)
 
 static void __exit nvmet_fc_exit_module(void)
 {
+	/* ensure any shutdown operation, e.g. delete ctrls have finished */
+	flush_workqueue(nvmet_wq);
+
 	/* sanity check - all lports should be removed */
 	if (!list_empty(&nvmet_fc_target_list))
 		pr_warn("%s: targetport list not empty\n", __func__);
-- 
2.41.0
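
Note (not part of the patch): the teardown ordering the commit message
describes (unpublish the RCU-visible pointers, drop the association from
the list, wait one grace period, then free the owned queues from the kref
release callback) can be shown with a minimal, self-contained sketch. All
names below (demo_assoc, demo_queue, demo_assoc_delete, the array size of
4) are hypothetical and do not exist in drivers/nvme/target/fc.c; the
sketch assumes the association was published earlier with list_add_rcu().

#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/list.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct demo_queue {
	int qid;
};

struct demo_assoc {
	struct list_head	a_list;
	struct demo_queue	*owned[4];	/* owner pointers, like _queues[] */
	struct demo_queue __rcu	*published[4];	/* RCU-visible lookup pointers */
	struct kref		ref;
	struct rcu_head		rcu;
};

static DEFINE_SPINLOCK(demo_lock);

/*
 * RCU read side: a lookup that may run concurrently with delete. Once
 * demo_assoc_delete() has unpublished the pointers and a grace period has
 * elapsed, this can no longer observe the dying queue.
 */
static bool demo_queue_is_live(struct demo_assoc *assoc, int qid)
{
	bool live;

	rcu_read_lock();
	live = rcu_dereference(assoc->published[qid]) != NULL;
	rcu_read_unlock();
	return live;
}

/*
 * kref release callback: runs only after the last reference is dropped,
 * so the owned queues can be torn down unconditionally here. No reverse
 * queue -> assoc refcount is needed, because queues are freed from this
 * single place.
 */
static void demo_assoc_free(struct kref *ref)
{
	struct demo_assoc *assoc = container_of(ref, struct demo_assoc, ref);
	int i;

	for (i = 0; i < ARRAY_SIZE(assoc->owned); i++)
		kfree(assoc->owned[i]);
	kfree_rcu(assoc, rcu);
}

/*
 * Delete path: unpublish first so new lookups can no longer find the
 * association, remove it from the (previously list_add_rcu'd) list, wait
 * one RCU grace period so in-flight readers finish, then drop the
 * reference; freeing happens in the release callback above.
 */
static void demo_assoc_delete(struct demo_assoc *assoc)
{
	int i;

	for (i = 0; i < ARRAY_SIZE(assoc->published); i++)
		rcu_assign_pointer(assoc->published[i], NULL);

	spin_lock(&demo_lock);
	list_del_rcu(&assoc->a_list);
	spin_unlock(&demo_lock);

	synchronize_rcu();

	kref_put(&assoc->ref, demo_assoc_free);
}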