[PATCH RESEND] sched: Change nr_uninterruptible type to unsigned long
Posted by Aruna Ramakrishna 2 months, 4 weeks ago
The commit e6fe3f422be1 ("sched: Make multiple runqueue task counters
32-bit") changed nr_uninterruptible to an unsigned int. But the
nr_uninterruptible values for each of the CPU runqueues can grow to
large numbers, sometimes exceeding INT_MAX. This is valid if, over
time, a large number of tasks are migrated off of one CPU after going
into an uninterruptible state. Only the sum of all nr_uninterruptible
values across all CPUs yields the correct result, as explained in a
comment in kernel/sched/loadavg.c.

Change the type of nr_uninterruptible back to unsigned long to prevent
overflows, and thus the miscalculation of load average.

Fixes: e6fe3f422be1 ("sched: Make multiple runqueue task counters 32-bit")

Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
---
 kernel/sched/loadavg.c | 2 +-
 kernel/sched/sched.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
index c48900b856a2..52ca8e268cfc 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -80,7 +80,7 @@ long calc_load_fold_active(struct rq *this_rq, long adjust)
 	long nr_active, delta = 0;
 
 	nr_active = this_rq->nr_running - adjust;
-	nr_active += (int)this_rq->nr_uninterruptible;
+	nr_active += (long)this_rq->nr_uninterruptible;
 
 	if (nr_active != this_rq->calc_load_active) {
 		delta = nr_active - this_rq->calc_load_active;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 475bb5998295..83e3aa917142 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1149,7 +1149,7 @@ struct rq {
 	 * one CPU and if it got migrated afterwards it may decrease
 	 * it on another CPU. Always updated under the runqueue lock:
 	 */
-	unsigned int		nr_uninterruptible;
+	unsigned long		nr_uninterruptible;
 
 	union {
 		struct task_struct __rcu *donor; /* Scheduler context */

base-commit: 86731a2a651e58953fc949573895f2fa6d456841
prerequisite-patch-id: dd6db7012c5094dec89e689ba56fd3551d2b4a40
-- 
2.43.5
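
For illustration only, not part of the patch: a minimal userspace sketch,
assuming an LP64 target (64-bit long) and the usual two's-complement
behaviour for out-of-range unsigned-to-signed conversions, of how the old
(int) cast corrupts a per-CPU nr_uninterruptible once it legitimately
grows past INT_MAX, while the new (long) cast on an unsigned long counter
preserves the value.

/* cast_demo.c -- standalone sketch, not kernel code; assumes LP64 and
 * two's-complement conversion of out-of-range values (gcc/clang).
 */
#include <stdio.h>

int main(void)
{
	/*
	 * A per-CPU counter that legitimately grew past INT_MAX, e.g.
	 * because many tasks went uninterruptible on this CPU and were
	 * later woken (and the counter decremented) on other CPUs.
	 */
	unsigned int  nr_unint_32 = 3000000000U;   /* pre-fix type  */
	unsigned long nr_unint_64 = 3000000000UL;  /* post-fix type */

	printf("(int)  cast: %ld\n", (long)(int)nr_unint_32);
	/* prints -1294967296: wrapped, corrupts the fold */
	printf("(long) cast: %ld\n", (long)nr_unint_64);
	/* prints  3000000000: value preserved */
	return 0;
}

In calc_load_fold_active() such a wrapped value enters nr_active as a
bogus delta of roughly -2^32, which is where the load average
miscalculation comes from.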
Re: [PATCH RESEND] sched: Change nr_uninterruptible type to unsigned long
Posted by Peter Zijlstra 2 months, 3 weeks ago
On Wed, Jul 09, 2025 at 05:33:28PM +0000, Aruna Ramakrishna wrote:
> The commit e6fe3f422be1 ("sched: Make multiple runqueue task counters
> 32-bit") changed nr_uninterruptible to an unsigned int. But the
> nr_uninterruptible values for each of the CPU runqueues can grow to
> large numbers, sometimes exceeding INT_MAX. This is valid if, over
> time, a large number of tasks are migrated off of one CPU after going
> into an uninterruptible state. Only the sum of all nr_uninterruptible
> values across all CPUs yields the correct result, as explained in a
> comment in kernel/sched/loadavg.c.
> 
> Change the type of nr_uninterruptible back to unsigned long to prevent
> overflows, and thus the miscalculation of load average.
> 
> Fixes: e6fe3f422be1 ("sched: Make multiple runqueue task counters 32-bit")
> 
> Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>

Thanks!
[tip: sched/urgent] sched: Change nr_uninterruptible type to unsigned long
Posted by tip-bot2 for Aruna Ramakrishna 2 months, 3 weeks ago
The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     36569780b0d64de283f9d6c2195fd1a43e221ee8
Gitweb:        https://git.kernel.org/tip/36569780b0d64de283f9d6c2195fd1a43e221ee8
Author:        Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
AuthorDate:    Wed, 09 Jul 2025 17:33:28 +00:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 14 Jul 2025 10:59:31 +02:00

sched: Change nr_uninterruptible type to unsigned long

The commit e6fe3f422be1 ("sched: Make multiple runqueue task counters
32-bit") changed nr_uninterruptible to an unsigned int. But the
nr_uninterruptible values for each of the CPU runqueues can grow to
large numbers, sometimes exceeding INT_MAX. This is valid if, over
time, a large number of tasks are migrated off of one CPU after going
into an uninterruptible state. Only the sum of all nr_uninterruptible
values across all CPUs yields the correct result, as explained in a
comment in kernel/sched/loadavg.c.

Change the type of nr_uninterruptible back to unsigned long to prevent
overflows, and thus the miscalculation of load average.

Fixes: e6fe3f422be1 ("sched: Make multiple runqueue task counters 32-bit")

Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250709173328.606794-1-aruna.ramakrishna@oracle.com
---
 kernel/sched/loadavg.c | 2 +-
 kernel/sched/sched.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
index c48900b..52ca8e2 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -80,7 +80,7 @@ long calc_load_fold_active(struct rq *this_rq, long adjust)
 	long nr_active, delta = 0;
 
 	nr_active = this_rq->nr_running - adjust;
-	nr_active += (int)this_rq->nr_uninterruptible;
+	nr_active += (long)this_rq->nr_uninterruptible;
 
 	if (nr_active != this_rq->calc_load_active) {
 		delta = nr_active - this_rq->calc_load_active;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 475bb59..83e3aa9 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1149,7 +1149,7 @@ struct rq {
 	 * one CPU and if it got migrated afterwards it may decrease
 	 * it on another CPU. Always updated under the runqueue lock:
 	 */
-	unsigned int		nr_uninterruptible;
+	unsigned long		nr_uninterruptible;
 
 	union {
 		struct task_struct __rcu *donor; /* Scheduler context */
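
Again for illustration only (same LP64 and two's-complement assumptions,
not kernel code): a simplified model of the fold above, showing why the
per-CPU nr_uninterruptible values are individually meaningless, why only
their sum across CPUs is meaningful, and how the pre-fix 32-bit counter
could make that sum come out wrong by a multiple of 2^32.

/* sum_demo.c -- simplified model of summing per-CPU nr_uninterruptible;
 * illustration only, assumes LP64 and two's-complement conversions.
 */
#include <stdio.h>

int main(void)
{
	/*
	 * True per-CPU contributions: CPU0 accumulated the sleeps, CPU1 and
	 * CPU2 the wakeups. Each value is huge on its own, but no task is
	 * actually uninterruptible any more: the true sum is 0.
	 */
	long true_val[3] = { 3000000000L, -1500000000L, -1500000000L };

	long sum_old = 0;	/* unsigned int counter + (int) cast   (pre-fix)  */
	long sum_new = 0;	/* unsigned long counter + (long) cast (post-fix) */

	for (int i = 0; i < 3; i++) {
		unsigned int  u32 = (unsigned int)true_val[i];
		unsigned long u64 = (unsigned long)true_val[i];

		sum_old += (int)u32;	/* CPU0's 3e9 no longer fits in an int */
		sum_new += (long)u64;
	}

	printf("32-bit fold: %ld\n", sum_old);	/* -4294967296: off by 2^32 */
	printf("64-bit fold: %ld\n", sum_new);	/* 0: correct */
	return 0;
}

The wrapped "negative" per-CPU values still cancel correctly under 64-bit
arithmetic, which is the property the comment in kernel/sched/loadavg.c
relies on; with the 32-bit counter the cancellation can break, as above,
once a runqueue's value leaves the int range.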