From nobody Fri Nov 29 21:43:54 2024
From: Frederic Weisbecker
To: Thomas Gleixner
Cc: LKML, Frederic Weisbecker, Boqun Feng, Joel Fernandes,
 Neeraj Upadhyay, "Paul E . McKenney", Uladzislau Rezki, Zqiang, rcu,
 Cheng-Jui Wang
Subject: [PATCH 1/3] rcu/nocb: Fix RT throttling hrtimer armed from offline CPU
Date: Fri, 13 Sep 2024 23:42:03 +0200
Message-ID: <20240913214205.12359-2-frederic@kernel.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240913214205.12359-1-frederic@kernel.org>
References: <20240913214205.12359-1-frederic@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

After a CPU is marked offline and until it reaches its final trip to
idle, rcuo has several opportunities to be woken up, either because
a callback has been queued in the meantime or because
rcutree_report_cpu_dead() has issued the final deferred NOCB wake up.

If RCU-boosting is enabled, RCU kthreads are set to SCHED_FIFO policy.
And if RT-bandwidth is enabled, the related hrtimer might be armed.
However this then happens after hrtimers have been migrated at the
CPUHP_AP_HRTIMERS_DYING stage, which is broken as reported by the
following warning:

 Call trace:
  enqueue_hrtimer+0x7c/0xf8
  hrtimer_start_range_ns+0x2b8/0x300
  enqueue_task_rt+0x298/0x3f0
  enqueue_task+0x94/0x188
  ttwu_do_activate+0xb4/0x27c
  try_to_wake_up+0x2d8/0x79c
  wake_up_process+0x18/0x28
  __wake_nocb_gp+0x80/0x1a0
  do_nocb_deferred_wakeup_common+0x3c/0xcc
  rcu_report_dead+0x68/0x1ac
  cpuhp_report_idle_dead+0x48/0x9c
  do_idle+0x288/0x294
  cpu_startup_entry+0x34/0x3c
  secondary_start_kernel+0x138/0x158

Fix this by waking up rcuo using an IPI if necessary. Since the
existing API to deal with this situation only handles swait queues,
rcuo is only woken up from offline CPUs if it's not already waiting on
a grace period. In the worst case some callbacks will just wait for a
grace period to complete before being assigned to a subsequent one.

Reported-by: Cheng-Jui Wang (王正睿)
Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_nocb.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 1e92164116ef..08608fe1792c 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -216,7 +216,10 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
 	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 	if (needwake) {
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake"));
-		wake_up_process(rdp_gp->nocb_gp_kthread);
+		if (cpu_is_offline(raw_smp_processor_id()))
+			swake_up_one_online(&rdp_gp->nocb_gp_wq);
+		else
+			wake_up_process(rdp_gp->nocb_gp_kthread);
 	}
 
 	return needwake;
-- 
2.45.2

From nobody Fri Nov 29 21:43:54 2024
From: Frederic Weisbecker
To: Thomas Gleixner
Cc: LKML, Frederic Weisbecker, Boqun Feng, Joel Fernandes,
 Neeraj Upadhyay, "Paul E . McKenney", Uladzislau Rezki, Zqiang, rcu,
 Cheng-Jui Wang
Subject: [PATCH 2/3] rcu/nocb: Conditionally wake up rcuo if not already waiting on GP
Date: Fri, 13 Sep 2024 23:42:04 +0200
Message-ID: <20240913214205.12359-3-frederic@kernel.org>
In-Reply-To: <20240913214205.12359-1-frederic@kernel.org>
References: <20240913214205.12359-1-frederic@kernel.org>

A callback enqueuer currently wakes up the rcuo kthread if it is adding
the first non-done callback of a CPU, whether the kthread is waiting on
a grace period or not (unless the CPU is offline).

This looks like a desired behaviour because then the rcuo kthread
doesn't wait for the end of the current grace period to handle the
callback. It is accelerated right away and assigned to the next grace
period. The GP kthread is notified about that fact and iterates with
the upcoming GP without sleeping in-between.

However this best-case scenario is contradicted by a few details,
depending on the situation:

1) If the callback is a non-bypass one queued with IRQs enabled, the
   wake up only occurs if no other pending callbacks are on the list.
   Therefore the theoretical "optimization" actually applies only on
   rare occasions.

2) If the callback is a non-bypass one queued with IRQs disabled, the
   situation is similar with even more uncertainty due to the deferred
   wake up.

3) If the callback is lazy, a few jiffies don't make any difference.

4) If the callback is bypass, the wake up timer is programmed 2 jiffies
   ahead by rcuo in case the regular pending queue has been handled in
   the meantime. The rare storm of callbacks can otherwise wait for the
   currently elapsing grace period to be flushed and handled.

For all those reasons, the optimization is only theoretical and
occasional. Therefore it is reasonable that callback enqueuers only
wake up the rcuo kthread when it is not already waiting on a grace
period to complete.

Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_nocb.h | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 08608fe1792c..0c9eca1cc76e 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -216,10 +216,7 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
 	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 	if (needwake) {
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake"));
-		if (cpu_is_offline(raw_smp_processor_id()))
-			swake_up_one_online(&rdp_gp->nocb_gp_wq);
-		else
-			wake_up_process(rdp_gp->nocb_gp_kthread);
+		swake_up_one_online(&rdp_gp->nocb_gp_wq);
 	}
 
 	return needwake;
-- 
2.45.2

From nobody Fri Nov 29 21:43:54 2024
From: Frederic Weisbecker
To: Thomas Gleixner
Cc: LKML, Frederic Weisbecker, Boqun Feng, Joel Fernandes,
 Neeraj Upadhyay, "Paul E . McKenney", Uladzislau Rezki, Zqiang, rcu
Subject: [PATCH 3/3] rcu/nocb: Remove superfluous memory barrier after bypass enqueue
Date: Fri, 13 Sep 2024 23:42:05 +0200
Message-ID: <20240913214205.12359-4-frederic@kernel.org>
In-Reply-To: <20240913214205.12359-1-frederic@kernel.org>
References: <20240913214205.12359-1-frederic@kernel.org>

Pre-GP accesses performed by the update side must be ordered against
post-GP accesses performed by the readers. This is ensured by the
bypass or nocb locking at enqueue time, followed by the fully ordered
rnp locking initiated while callbacks are accelerated, and then
propagated throughout the whole GP lifecycle associated with the
callbacks.

Therefore the explicit barrier advertising ordering between bypass
enqueue and rcuo wakeup is superfluous. If anything, it would only
order the first bypass callback enqueue against the rcuo wakeup and
ignore all the subsequent ones.

Remove the needless barrier.

Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tree_nocb.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 0c9eca1cc76e..755ada098035 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -493,7 +493,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
 	}
 	rcu_nocb_bypass_unlock(rdp);
-	smp_mb(); /* Order enqueue before wake. */
+
 	// A wake up of the grace period kthread or timer adjustment
 	// needs to be done only if:
 	// 1. Bypass list was fully empty before (this is the first
-- 
2.45.2
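
The difference between the two wake-up styles discussed in patches 1/3 and 2/3 can be illustrated with a small userspace sketch. It is a toy model, not kernel code: `toy_task`, `toy_waitq`, `toy_wake_task()` and `toy_wake_queue()` are hypothetical names standing in for the rcuo kthread, an swait queue, a wake_up_process()-style wake and a swake_up_one_online()-style wake. It assumes, as the changelog of 1/3 states, that the queue-based wake only reaches rcuo when it is not already waiting on a grace period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a wait queue; only its identity (address) matters. */
struct toy_waitq {
	const char *name;
};

/* Toy stand-in for the rcuo kthread: NULL means it is running. */
struct toy_task {
	const struct toy_waitq *sleeping_on;
};

/* Task-based wake: reaches the task whichever queue it sleeps on. */
bool toy_wake_task(struct toy_task *t)
{
	assert(t != NULL);
	if (t->sleeping_on == NULL)
		return false;	/* already running, nothing to do */
	t->sleeping_on = NULL;
	return true;
}

/* Queue-based wake: only reaches the task if it sleeps on that queue. */
bool toy_wake_queue(struct toy_task *t, const struct toy_waitq *q)
{
	assert(t != NULL && q != NULL);
	if (t->sleeping_on != q)
		return false;	/* running, or sleeping on another queue */
	t->sleeping_on = NULL;
	return true;
}
```

In this model, a task sleeping on a grace-period queue is left asleep by `toy_wake_queue()` on a different queue, which mirrors the worst case described in 1/3: the callbacks simply wait for the current grace period to complete.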