From nobody Mon Jun 22 14:25:01 2026
From: Zqiang
To: paulmck@kernel.org, frederic@kernel.org
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 1/2] rcu: Call rcu_nocb_rdp_deoffload() directly after rcuog/op kthreads spawn failed
Date: Tue, 22 Mar 2022 21:17:52 +0800
Message-Id: <20220322131753.1680329-2-qiang1.zhang@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220322131753.1680329-1-qiang1.zhang@intel.com>
References: <20220322131753.1680329-1-qiang1.zhang@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

If spawning the rcuog or rcuop kthread fails, the offloaded rdp needs to
be de-offloaded. Otherwise the target rdp is still considered offloaded,
but nothing actually handles its callbacks.

Signed-off-by: Zqiang
---
 kernel/rcu/tree_nocb.h | 77 +++++++++++++++++++++++++++++++++---------
 1 file changed, 61 insertions(+), 16 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 46694e13398a..154934f3daa9 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -972,10 +972,7 @@ static int rdp_offload_toggle(struct rcu_data *rdp,
 	}
 	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 
-	if (wake_gp)
-		wake_up_process(rdp_gp->nocb_gp_kthread);
-
-	return 0;
+	return wake_gp;
 }
 
 static long rcu_nocb_rdp_deoffload(void *arg)
@@ -983,9 +980,18 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	struct rcu_data *rdp = arg;
 	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
-	int ret;
+	int wake_gp;
+	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
+	int condition;
 
-	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
+	/*
+	 * The rcu_nocb_rdp_deoffload() will be called directly when
+	 * rcuog/op spawn failed, because at this time the rdp->cpu
+	 * is not online(cpu_online(rdp->cpu) return false), the deoffload
+	 * operation was not performed on rdp->cpu, to avoid warnings
+	 * add cpu_online(rdp->cpu) condition judgment.
+	 */
+	WARN_ON_ONCE((rdp->cpu != raw_smp_processor_id()) && cpu_online(rdp->cpu));
 
 	pr_info("De-offloading %d\n", rdp->cpu);
 
@@ -1009,10 +1015,35 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	 */
 	rcu_segcblist_set_flags(cblist, SEGCBLIST_RCU_CORE);
 	invoke_rcu_core();
-	ret = rdp_offload_toggle(rdp, false, flags);
-	swait_event_exclusive(rdp->nocb_state_wq,
-			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
-							SEGCBLIST_KTHREAD_GP));
+	wake_gp = rdp_offload_toggle(rdp, false, flags);
+
+	mutex_lock(&rdp_gp->nocb_gp_kthread_mutex);
+	if (rdp_gp->nocb_gp_kthread) {
+		if (wake_gp)
+			wake_up_process(rdp_gp->nocb_gp_kthread);
+
+		if (rdp->nocb_cb_kthread) {
+			condition = SEGCBLIST_KTHREAD_CB | SEGCBLIST_KTHREAD_GP;
+		} else {
+			/*
+			 *If rcuop kthread spawn failed, direct remove SEGCBLIST_KTHREAD_CB
+			 *just wait SEGCBLIST_KTHREAD_GP to be cleared.
+			 */
+			condition = SEGCBLIST_KTHREAD_GP;
+			rcu_nocb_lock_irqsave(rdp, flags);
+			rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_CB);
+			rcu_nocb_unlock_irqrestore(rdp, flags);
+		}
+		swait_event_exclusive(rdp->nocb_state_wq,
+				!rcu_segcblist_test_flags(cblist, condition));
+	} else {
+		rcu_nocb_lock_irqsave(rdp, flags);
+		rcu_segcblist_clear_flags(cblist,
+				SEGCBLIST_KTHREAD_CB | SEGCBLIST_KTHREAD_GP);
+		rcu_nocb_unlock_irqrestore(rdp, flags);
+	}
+	mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex);
+
 	/* Stop nocb_gp_wait() from iterating over this structure. */
 	list_del_rcu(&rdp->nocb_entry_rdp);
 	/*
@@ -1035,7 +1066,7 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
 
 
-	return ret;
+	return 0;
 }
 
 int rcu_nocb_cpu_deoffload(int cpu)
@@ -1067,7 +1098,8 @@ static long rcu_nocb_rdp_offload(void *arg)
 	struct rcu_data *rdp = arg;
 	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
-	int ret;
+	int wake_gp;
+	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
 
 	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
 	/*
@@ -1077,6 +1109,9 @@ static long rcu_nocb_rdp_offload(void *arg)
 	if (!rdp->nocb_gp_rdp)
 		return -EINVAL;
 
+	if (WARN_ON_ONCE(!rdp_gp->nocb_gp_kthread))
+		return -EINVAL;
+
 	pr_info("Offloading %d\n", rdp->cpu);
 
 	/*
@@ -1111,7 +1146,9 @@ static long rcu_nocb_rdp_offload(void *arg)
 	 * WRITE flags               READ callbacks
 	 * rcu_nocb_unlock()         rcu_nocb_unlock()
 	 */
-	ret = rdp_offload_toggle(rdp, true, flags);
+	wake_gp = rdp_offload_toggle(rdp, true, flags);
+	if (wake_gp)
+		wake_up_process(rdp_gp->nocb_gp_kthread);
 	swait_event_exclusive(rdp->nocb_state_wq,
 			      rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB) &&
 			      rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP));
@@ -1124,7 +1161,7 @@ static long rcu_nocb_rdp_offload(void *arg)
 	rcu_segcblist_clear_flags(cblist, SEGCBLIST_RCU_CORE);
 	rcu_nocb_unlock_irqrestore(rdp, flags);
 
-	return ret;
+	return 0;
 }
 
 int rcu_nocb_cpu_offload(int cpu)
@@ -1246,7 +1283,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 				"rcuog/%d", rdp_gp->cpu);
 		if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__)) {
 			mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex);
-			return;
+			goto end;
 		}
 		WRITE_ONCE(rdp_gp->nocb_gp_kthread, t);
 		if (kthread_prio)
@@ -1258,12 +1295,20 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 	t = kthread_run(rcu_nocb_cb_kthread, rdp,
 			"rcuo%c/%d", rcu_state.abbr, cpu);
 	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
-		return;
+		goto end;
 
 	if (kthread_prio)
 		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
 	WRITE_ONCE(rdp->nocb_cb_kthread, t);
 	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
+	return;
+end:
+	mutex_lock(&rcu_state.barrier_mutex);
+	if (rcu_rdp_is_offloaded(rdp)) {
+		rcu_nocb_rdp_deoffload(rdp);
+		cpumask_clear_cpu(cpu, rcu_nocb_mask);
+	}
+	mutex_unlock(&rcu_state.barrier_mutex);
 }
 
 /* How many CB CPU IDs per GP kthread?  Default of -1 for sqrt(nr_cpu_ids). */
-- 
2.25.1

From nobody Mon Jun 22 14:25:01 2026
From: Zqiang
To: paulmck@kernel.org, frederic@kernel.org
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 2/2] rcu: Invert the locking dependency order between rcu_state.barrier_mutex and hotplug lock
Date: Tue, 22 Mar 2022 21:17:53 +0800
Message-Id: <20220322131753.1680329-3-qiang1.zhang@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220322131753.1680329-1-qiang1.zhang@intel.com>
References: <20220322131753.1680329-1-qiang1.zhang@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When rcutree_prepare_cpu() is called, the cpus write lock is already
held, so the failure path takes only the barrier_mutex before calling
rcu_nocb_rdp_deoffload(). Therefore, invert the locking dependency order
between rcu_state.barrier_mutex and the hotplug lock.

Signed-off-by: Zqiang
---
 kernel/rcu/tree_nocb.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 154934f3daa9..e3d1bd26d6eb 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1074,8 +1074,8 @@ int rcu_nocb_cpu_deoffload(int cpu)
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	int ret = 0;
 
-	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
+	mutex_lock(&rcu_state.barrier_mutex);
 	if (rcu_rdp_is_offloaded(rdp)) {
 		if (cpu_online(cpu)) {
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
@@ -1086,8 +1086,8 @@ int rcu_nocb_cpu_deoffload(int cpu)
 			ret = -EINVAL;
 		}
 	}
-	cpus_read_unlock();
 	mutex_unlock(&rcu_state.barrier_mutex);
+	cpus_read_unlock();
 
 	return ret;
 }
@@ -1169,8 +1169,8 @@ int rcu_nocb_cpu_offload(int cpu)
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	int ret = 0;
 
-	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
+	mutex_lock(&rcu_state.barrier_mutex);
 	if (!rcu_rdp_is_offloaded(rdp)) {
 		if (cpu_online(cpu)) {
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
@@ -1181,8 +1181,8 @@ int rcu_nocb_cpu_offload(int cpu)
 			ret = -EINVAL;
 		}
 	}
-	cpus_read_unlock();
 	mutex_unlock(&rcu_state.barrier_mutex);
+	cpus_read_unlock();
 
 	return ret;
 }
-- 
2.25.1