From nobody Thu Apr 2 19:49:59 2026
Date: Wed, 21 Sep 2022 01:25:48 +0000
In-Reply-To: <20220921012550.3288570-1-jstultz@google.com>
References: <20220921012550.3288570-1-jstultz@google.com>
X-Mailer: git-send-email 2.37.3.968.ga6b4b080e4-goog
Message-ID: <20220921012550.3288570-2-jstultz@google.com>
Subject: [RFC PATCH v3 1/3] softirq: Add generic accessor to percpu softirq_pending data
From: John Stultz
To: LKML
Cc: John Stultz, John Dias, "Connor O'Brien", Rick Yiu, John Kacur,
    Qais Yousef, Chris Redpath, Abhijeet Dharmapurikar, Peter Zijlstra,
    Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
    Steven Rostedt, Thomas Gleixner, Heiko Carstens, Vasily Gorbik,
    kernel-team@android.com, kernel test robot
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

In a previous iteration of this patch series, I was checking:

    per_cpu(irq_stat, cpu).__softirq_pending

which resulted in build errors on s390.

This patch tries to create a generic accessor to this percpu
softirq_pending data. This interface is inherently racy, as it
reads percpu data without a lock.
However, being able to peek at the softirq pending data allows us
to make better decisions about RT task placement than ignoring it.

On s390 this call returns 0, which may not be ideal but results in
no functional change from what we do now. Feedback or suggestions
for a better approach here would be welcome!

Cc: John Dias
Cc: Connor O'Brien
Cc: Rick Yiu
Cc: John Kacur
Cc: Qais Yousef
Cc: Chris Redpath
Cc: Abhijeet Dharmapurikar
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Steven Rostedt
Cc: Thomas Gleixner
Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: kernel-team@android.com
Reported-by: kernel test robot
Signed-off-by: John Stultz
---
 arch/s390/include/asm/hardirq.h |  6 ++++++
 include/linux/interrupt.h       | 11 +++++++++++
 2 files changed, 17 insertions(+)

diff --git a/arch/s390/include/asm/hardirq.h b/arch/s390/include/asm/hardirq.h
index 58668ffb5488..cd9cc11588ab 100644
--- a/arch/s390/include/asm/hardirq.h
+++ b/arch/s390/include/asm/hardirq.h
@@ -16,6 +16,12 @@
 #define local_softirq_pending() (S390_lowcore.softirq_pending)
 #define set_softirq_pending(x) (S390_lowcore.softirq_pending = (x))
 #define or_softirq_pending(x) (S390_lowcore.softirq_pending |= (x))
+/*
+ * Not sure what the right thing is here for s390,
+ * but returning 0 will result in no logical change
+ * from what happens now.
+ */
+#define __cpu_softirq_pending(x) (0)
 
 #define __ARCH_IRQ_STAT
 #define __ARCH_IRQ_EXIT_IRQS_DISABLED

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index a92bce40b04b..a749a8663841 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -527,6 +527,17 @@ DECLARE_STATIC_KEY_FALSE(force_irqthreads_key);
 #define set_softirq_pending(x) (__this_cpu_write(local_softirq_pending_ref, (x)))
 #define or_softirq_pending(x) (__this_cpu_or(local_softirq_pending_ref, (x)))
 
+/**
+ * __cpu_softirq_pending() - Checks to see if softirq is pending on a cpu
+ *
+ * This helper is inherently
+ * racy, as we're accessing per-cpu data w/o locks.
+ * But peeking at the flag can still be useful when deciding where to place a
+ * task.
+ */
+static inline u32 __cpu_softirq_pending(int cpu)
+{
+	return (u32)per_cpu(local_softirq_pending_ref, cpu);
+}
 #endif /* local_softirq_pending */
 
 /* Some architectures might implement lazy enabling/disabling of
-- 
2.37.3.968.ga6b4b080e4-goog

From nobody Thu Apr 2 19:49:59 2026
Date: Wed, 21 Sep 2022 01:25:49 +0000
In-Reply-To: <20220921012550.3288570-1-jstultz@google.com>
References: <20220921012550.3288570-1-jstultz@google.com>
Message-ID: <20220921012550.3288570-3-jstultz@google.com>
Subject: [RFC PATCH v3 2/3] sched: Avoid placing RT threads on cores handling long softirqs
From: John Stultz
To: LKML
Cc: "Connor O'Brien", John Dias, Rick Yiu, John Kacur, Qais Yousef,
    Chris Redpath, Abhijeet Dharmapurikar, Peter Zijlstra, Ingo Molnar,
    Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
    Thomas Gleixner, kernel-team@android.com, "J. Avila", John Stultz
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Connor O'Brien

In certain audio use cases, scheduling RT threads on cores that are
handling softirqs can lead to glitches. Prevent this behavior in
cases where the softirq is likely to take a long time. To avoid
unnecessary migrations, the old behavior is preserved for the RCU,
SCHED and TIMER softirqs, which are expected to be relatively quick.

This patch reworks and combines two related changes originally by
John Dias.

Cc: John Dias
Cc: Connor O'Brien
Cc: Rick Yiu
Cc: John Kacur
Cc: Qais Yousef
Cc: Chris Redpath
Cc: Abhijeet Dharmapurikar
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Steven Rostedt
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Signed-off-by: John Dias
[elavila: Port to mainline, amend commit text]
Signed-off-by: J. Avila
[connoro: Reworked, simplified, and merged two patches together]
Signed-off-by: Connor O'Brien
[jstultz: Further simplified and fixed issues, reworded commit
 message, removed arm64-isms]
Signed-off-by: John Stultz
---
v2:
* Reformatted Kconfig entry to match coding style
  (Reported-by: Randy Dunlap)
* Made rt_task_fits_capacity_and_may_preempt static to avoid warnings
  (Reported-by: kernel test robot)
* Rework to use preempt_count and drop kconfig dependency on ARM64
v3:
* Use introduced __cpu_softirq_pending() to avoid s390 build issues
  (Reported-by: kernel test robot)
---
 include/linux/interrupt.h |  7 +++++
 init/Kconfig              | 10 ++++++
 kernel/sched/rt.c         | 64 +++++++++++++++++++++++++++++++++------
 kernel/softirq.c          |  9 ++++++
 4 files changed, 81 insertions(+), 9 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index a749a8663841..1d126b8495bc 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -582,6 +582,12 @@ enum
  *  _ IRQ_POLL: irq_poll_cpu_dead() migrates
  *    the queue
  */
 #define SOFTIRQ_HOTPLUG_SAFE_MASK (BIT(RCU_SOFTIRQ) | BIT(IRQ_POLL_SOFTIRQ))
+/* Softirqs where the handling might be long: */
+#define LONG_SOFTIRQ_MASK	((1 << NET_TX_SOFTIRQ)   | \
+				 (1 << NET_RX_SOFTIRQ)   | \
+				 (1 << BLOCK_SOFTIRQ)    | \
+				 (1 << IRQ_POLL_SOFTIRQ) | \
+				 (1 << TASKLET_SOFTIRQ))
 
 /* map softirq index to softirq name. update 'softirq_to_name' in
  * kernel/softirq.c when adding a new softirq.
@@ -617,6 +623,7 @@ extern void raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq(unsigned int nr);
 
 DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
+DECLARE_PER_CPU(u32, active_softirqs);
 
 static inline struct task_struct *this_cpu_ksoftirqd(void)
 {

diff --git a/init/Kconfig b/init/Kconfig
index 532362fcfe31..8b5add74b6cb 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1284,6 +1284,16 @@ config SCHED_AUTOGROUP
 	  desktop applications.  Task group autogeneration is currently based
 	  upon task session.
 
+config RT_SOFTIRQ_OPTIMIZATION
+	bool "Improve RT scheduling during long softirq execution"
+	depends on SMP
+	default n
+	help
+	  Enable an optimization which tries to avoid placing RT tasks on
+	  CPUs occupied by nonpreemptible work, such as a long softirq, or
+	  on CPUs which may soon block preemption, such as a CPU running a
+	  ksoftirqd thread that handles slow softirqs.
+
 config SYSFS_DEPRECATED
 	bool "Enable deprecated sysfs features to support old userspace tools"
 	depends on SYSFS

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 55f39c8f4203..826f56daecc5 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1599,12 +1599,49 @@ static void yield_task_rt(struct rq *rq)
 #ifdef CONFIG_SMP
 static int find_lowest_rq(struct task_struct *task);
 
+#ifdef CONFIG_RT_SOFTIRQ_OPTIMIZATION
+/*
+ * Return whether the task on the given cpu can be preempted, i.e. it is
+ * not currently non-preemptible while handling a potentially long softirq,
+ * nor likely to block preemption soon because it is a ksoftirqd thread
+ * that is handling slow softirqs.
+ */
+static bool task_may_preempt(struct task_struct *task, int cpu)
+{
+	u32 softirqs = per_cpu(active_softirqs, cpu) |
+		       __cpu_softirq_pending(cpu);
+	struct task_struct *cpu_ksoftirqd = per_cpu(ksoftirqd, cpu);
+	struct task_struct *curr;
+	struct rq *rq = cpu_rq(cpu);
+	int ret;
+
+	rcu_read_lock();
+	curr = READ_ONCE(rq->curr); /* unlocked access */
+	ret = !((softirqs & LONG_SOFTIRQ_MASK) &&
+		(curr == cpu_ksoftirqd ||
+		 preempt_count() & SOFTIRQ_MASK));
+	rcu_read_unlock();
+	return ret;
+}
+#else
+static bool task_may_preempt(struct task_struct *task, int cpu)
+{
+	return true;
+}
+#endif /* CONFIG_RT_SOFTIRQ_OPTIMIZATION */
+
+static bool rt_task_fits_capacity_and_may_preempt(struct task_struct *p, int cpu)
+{
+	return task_may_preempt(p, cpu) && rt_task_fits_capacity(p, cpu);
+}
+
 static int
 select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 {
 	struct task_struct *curr;
 	struct rq *rq;
 	bool test;
+	bool may_not_preempt;
 
 	/* For anything but wake ups, just return the task_cpu */
 	if (!(flags & (WF_TTWU | WF_FORK)))
@@ -1616,7 +1653,12 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 	curr = READ_ONCE(rq->curr); /* unlocked access */
 
 	/*
-	 * If the current task on @p's runqueue is an RT task, then
+	 * If the current task on @p's runqueue is a
+	 * softirq task,
+	 * it may run without preemption for a time that is
+	 * ill-suited for a waiting RT task. Therefore, try to
+	 * wake this RT task on another runqueue.
+	 *
+	 * Also, if the current task on @p's runqueue is an RT task, then
 	 * try to see if we can wake this RT task up on another
 	 * runqueue. Otherwise simply start this RT task
 	 * on its current runqueue.
@@ -1641,9 +1683,10 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 	 * requirement of the task - which is only important on heterogeneous
 	 * systems like big.LITTLE.
 	 */
-	test = curr &&
-	       unlikely(rt_task(curr)) &&
-	       (curr->nr_cpus_allowed < 2 || curr->prio <= p->prio);
+	may_not_preempt = !task_may_preempt(curr, cpu);
+	test = (curr && (may_not_preempt ||
+			 (unlikely(rt_task(curr)) &&
+			  (curr->nr_cpus_allowed < 2 || curr->prio <= p->prio))));
 
 	if (test || !rt_task_fits_capacity(p, cpu)) {
 		int target = find_lowest_rq(p);
@@ -1656,11 +1699,14 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 		goto out_unlock;
 
 		/*
-		 * Don't bother moving it if the destination CPU is
+		 * If cpu is non-preemptible, prefer the remote cpu
+		 * even if it's running a higher-prio task.
+		 * Otherwise: Don't bother moving it if the destination CPU is
 		 * not running a lower priority task.
 		 */
 		if (target != -1 &&
-		    p->prio < cpu_rq(target)->rt.highest_prio.curr)
+		    (may_not_preempt ||
+		     p->prio < cpu_rq(target)->rt.highest_prio.curr))
 			cpu = target;
 	}
 
@@ -1901,11 +1947,11 @@ static int find_lowest_rq(struct task_struct *task)
 
 		ret = cpupri_find_fitness(&task_rq(task)->rd->cpupri,
 					  task, lowest_mask,
-					  rt_task_fits_capacity);
+					  rt_task_fits_capacity_and_may_preempt);
 	} else {
 
-		ret = cpupri_find(&task_rq(task)->rd->cpupri,
-				  task, lowest_mask);
+		ret = cpupri_find_fitness(&task_rq(task)->rd->cpupri,
+					  task, lowest_mask, task_may_preempt);
 	}
 
 	if (!ret)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913c067d..35ee79dd8786 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -60,6 +60,13 @@ static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp
 
 DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
 
+/*
+ * active_softirqs -- per cpu, a mask of softirqs that are being handled,
+ * with the expectation that approximate answers are acceptable and therefore
+ * no synchronization.
+ */
+DEFINE_PER_CPU(u32, active_softirqs);
+
 const char * const softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL",
 	"TASKLET", "SCHED", "HRTIMER", "RCU"
@@ -551,6 +558,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 restart:
 	/* Reset the pending bitmask before enabling irqs */
 	set_softirq_pending(0);
+	__this_cpu_write(active_softirqs, pending);
 
 	local_irq_enable();
 
@@ -580,6 +588,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 		pending >>= softirq_bit;
 	}
 
+	__this_cpu_write(active_softirqs, 0);
 	if (!IS_ENABLED(CONFIG_PREEMPT_RT) &&
 	    __this_cpu_read(ksoftirqd) == current)
 		rcu_softirq_qs();
-- 
2.37.3.968.ga6b4b080e4-goog

From nobody Thu Apr 2 19:49:59 2026
Date: Wed, 21 Sep 2022 01:25:50 +0000
In-Reply-To: <20220921012550.3288570-1-jstultz@google.com>
References: <20220921012550.3288570-1-jstultz@google.com>
Message-ID: <20220921012550.3288570-4-jstultz@google.com>
Subject: [RFC PATCH v3 3/3] softirq: defer softirq processing to ksoftirqd if CPU is busy with RT
From: John Stultz
To: LKML
Cc: Pavankumar Kondeti,
    John Dias, "Connor O'Brien", Rick Yiu, John Kacur, Qais Yousef,
    Chris Redpath, Abhijeet Dharmapurikar, Peter Zijlstra, Ingo Molnar,
    Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
    Thomas Gleixner, kernel-team@android.com,
    Satya Durga Srinivasu Prabhala, "J. Avila", John Stultz
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Pavankumar Kondeti

Defer softirq processing to ksoftirqd if an RT task is running or
queued on the current CPU. This complements the RT task placement
algorithm, which tries to find a CPU that is not currently busy with
softirqs. Currently only the NET_TX, NET_RX, BLOCK and TASKLET
softirqs are deferred, as they can potentially run for a long time.

Additionally, this patch stubs out the ksoftirqd_running() logic in
the CONFIG_RT_SOFTIRQ_OPTIMIZATION case, as deferring potentially
long-running softirqs would otherwise cause the logic to not process
shorter-running softirqs immediately. By stubbing it out, the
potentially long-running softirqs are deferred, but the shorter
running ones can still run immediately.

This patch includes folded-in fixes by:
Lingutla Chandrasekhar
Satya Durga Srinivasu Prabhala
J. Avila

Cc: John Dias
Cc: Connor O'Brien
Cc: Rick Yiu
Cc: John Kacur
Cc: Qais Yousef
Cc: Chris Redpath
Cc: Abhijeet Dharmapurikar
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Steven Rostedt
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Signed-off-by: Pavankumar Kondeti
[satyap@codeaurora.org: trivial merge conflict resolution.]
Signed-off-by: Satya Durga Srinivasu Prabhala
[elavila: Port to mainline, squash with bugfix]
Signed-off-by: J. Avila
[jstultz: Rebase to linus/HEAD, minor rearranging of code, included
 bug fix Reported-by: Qais Yousef]
Signed-off-by: John Stultz
---
 include/linux/sched.h | 10 ++++++++++
 kernel/sched/cpupri.c | 13 +++++++++++++
 kernel/softirq.c      | 25 +++++++++++++++++++++++--
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e7b2f8a5c711..7f76371cbbb0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1826,6 +1826,16 @@ current_restore_flags(unsigned long orig_flags, unsigned long flags)
 
 extern int cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
 extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_effective_cpus);
+
+#ifdef CONFIG_RT_SOFTIRQ_OPTIMIZATION
+extern bool cpupri_check_rt(void);
+#else
+static inline bool cpupri_check_rt(void)
+{
+	return false;
+}
+#endif
+
 #ifdef CONFIG_SMP
 extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
 extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);

diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
index fa9ce9d83683..18dc75d16951 100644
--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -64,6 +64,19 @@ static int convert_prio(int prio)
 	return cpupri;
 }
 
+#ifdef CONFIG_RT_SOFTIRQ_OPTIMIZATION
+/*
+ * cpupri_check_rt - check if CPU has an RT task;
+ * should be called from an rcu-sched read section.
+ */
+bool cpupri_check_rt(void)
+{
+	int cpu = raw_smp_processor_id();
+
+	return cpu_rq(cpu)->rd->cpupri.cpu_to_pri[cpu] > CPUPRI_NORMAL;
+}
+#endif
+
 static inline int __cpupri_find(struct cpupri *cp, struct task_struct *p,
 				struct cpumask *lowest_mask, int idx)
 {

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 35ee79dd8786..203a70dc9459 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -87,6 +87,7 @@ static void wakeup_softirqd(void)
 		wake_up_process(tsk);
 }
 
+#ifndef CONFIG_RT_SOFTIRQ_OPTIMIZATION
 /*
  * If ksoftirqd is scheduled, we do not want to process pending softirqs
  * right now. Let ksoftirqd handle this at its own rate, to get fairness,
@@ -101,6 +102,9 @@ static bool ksoftirqd_running(unsigned long pending)
 		return false;
 	return tsk && task_is_running(tsk) && !__kthread_should_park(tsk);
 }
+#else
+#define ksoftirqd_running(pending) (false)
+#endif /* CONFIG_RT_SOFTIRQ_OPTIMIZATION */
 
 #ifdef CONFIG_TRACE_IRQFLAGS
 DEFINE_PER_CPU(int, hardirqs_enabled);
@@ -532,6 +536,17 @@ static inline bool lockdep_softirq_start(void) { return false; }
 static inline void lockdep_softirq_end(bool in_hardirq) { }
 #endif
 
+static __u32 softirq_deferred_for_rt(__u32 *pending)
+{
+	__u32 deferred = 0;
+
+	if (cpupri_check_rt()) {
+		deferred = *pending & LONG_SOFTIRQ_MASK;
+		*pending &= ~LONG_SOFTIRQ_MASK;
+	}
+	return deferred;
+}
+
 asmlinkage __visible void __softirq_entry __do_softirq(void)
 {
 	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
@@ -539,6 +554,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 	int max_restart = MAX_SOFTIRQ_RESTART;
 	struct softirq_action *h;
 	bool in_hardirq;
+	__u32 deferred;
 	__u32 pending;
 	int softirq_bit;
 
@@ -551,13 +567,15 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 
 	pending = local_softirq_pending();
 
+	deferred = softirq_deferred_for_rt(&pending);
 	softirq_handle_begin();
+
 	in_hardirq = lockdep_softirq_start();
 	account_softirq_enter(current);
 
restart:
 	/* Reset the pending bitmask before enabling irqs */
-	set_softirq_pending(0);
+	set_softirq_pending(deferred);
 	__this_cpu_write(active_softirqs, pending);
 
 	local_irq_enable();
 
@@ -596,13 +614,16 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 	local_irq_disable();
 
 	pending = local_softirq_pending();
+	deferred = softirq_deferred_for_rt(&pending);
+
 	if (pending) {
 		if (time_before(jiffies, end) && !need_resched() &&
 		    --max_restart)
 			goto restart;
+	}
 
+	if (pending | deferred)
 		wakeup_softirqd();
-	}
 
 	account_softirq_exit(current);
 	lockdep_softirq_end(in_hardirq);
-- 
2.37.3.968.ga6b4b080e4-goog