From: Qiliang Yuan
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Tejun Heo,
    Andrea Righi, Emil Tsalapatis, Qiliang Yuan, Ryan Newton, David Dai,
    zhidao su, Jake Hillion
Cc: Qiliang Yuan, David Vernet, Changwoo Min, Dietmar Eggemann,
    Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
    Dan Schatzberg, sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH] sched/ext: Add cpumask to skip unsuitable dispatch queues
Date: Mon, 2 Feb 2026 22:03:46 -0500
Message-ID: <20260203030400.3313990-1-realwujing@gmail.com>

Add a cpumask field to struct scx_dispatch_q to track the union of
allowed CPUs for all tasks in the queue. Use this mask to perform an
O(1) check in consume_dispatch_q() before scanning the queue.

When a CPU attempts to consume from a queue, it currently must iterate
through all N tasks to determine if any can run on that CPU. If the
queue contains only tasks pinned to other CPUs (via sched_setaffinity
or cgroups), this O(N) scan finds nothing.

With the cpumask, if the current CPU is not in the allowed set, skip
the entire queue immediately with a single bit test. This changes the
"queue is unsuitable" case from O(N) to O(1).

The mask is updated when tasks are enqueued and cleared when the queue
becomes empty, preventing permanent saturation from transient pinned
tasks.

This benefits large systems with CPU-pinned workloads, where CPUs
frequently scan queues containing no eligible tasks.

Signed-off-by: Qiliang Yuan
Signed-off-by: Qiliang Yuan
---
 include/linux/sched/ext.h |  1 +
 kernel/sched/ext.c        | 21 ++++++++++++++++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
index bcb962d5ee7d..f20e57cf53a3 100644
--- a/include/linux/sched/ext.h
+++ b/include/linux/sched/ext.h
@@ -79,6 +79,7 @@ struct scx_dispatch_q {
 	struct rhash_head	hash_node;
 	struct llist_node	free_node;
 	struct rcu_head		rcu;
+	struct cpumask		*cpus_allowed; /* union of all tasks' allowed cpus */
 };
 
 /* scx_entity.flags */
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index afe28c04d5aa..5a060c97cd64 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1120,8 +1120,12 @@ static void dispatch_enqueue(struct scx_sched *sch, struct scx_dispatch_q *dsq,
 
 	if (is_local)
 		local_dsq_post_enq(dsq, p, enq_flags);
-	else
+	else {
+		/* Update cpumask to track union of all tasks' allowed CPUs */
+		if (dsq->cpus_allowed)
+			cpumask_or(dsq->cpus_allowed, dsq->cpus_allowed, p->cpus_ptr);
 		raw_spin_unlock(&dsq->lock);
+	}
 }
 
 static void task_unlink_from_dsq(struct task_struct *p,
@@ -1138,6 +1142,10 @@ static void task_unlink_from_dsq(struct task_struct *p,
 	list_del_init(&p->scx.dsq_list.node);
 	dsq_mod_nr(dsq, -1);
 
+	/* Clear cpumask when queue becomes empty to prevent saturation */
+	if (dsq->nr == 0 && dsq->cpus_allowed)
+		cpumask_clear(dsq->cpus_allowed);
+
 	if (!(dsq->id & SCX_DSQ_FLAG_BUILTIN) && dsq->first_task == p) {
 		struct task_struct *first_task;
 
@@ -1897,6 +1905,14 @@ static bool consume_dispatch_q(struct scx_sched *sch, struct rq *rq,
 	if (list_empty(&dsq->list))
 		return false;
 
+	/*
+	 * O(1) optimization: Check if any task in the queue can run on this CPU.
+	 * If the cpumask is allocated and this CPU is not in the allowed set,
+	 * we can skip the entire queue without scanning.
+	 */
+	if (dsq->cpus_allowed && !cpumask_test_cpu(cpu_of(rq), dsq->cpus_allowed))
+		return false;
+
 	raw_spin_lock(&dsq->lock);
 
 	nldsq_for_each_task(p, dsq) {
@@ -3397,6 +3413,9 @@ static void init_dsq(struct scx_dispatch_q *dsq, u64 dsq_id)
 	raw_spin_lock_init(&dsq->lock);
 	INIT_LIST_HEAD(&dsq->list);
 	dsq->id = dsq_id;
+
+	/* Allocate cpumask for tracking allowed CPUs */
+	dsq->cpus_allowed = kzalloc(cpumask_size(), GFP_KERNEL);
 }
 
 static void free_dsq_irq_workfn(struct irq_work *irq_work)
-- 
2.51.0
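
For readers who want to experiment with the idea outside the kernel, the
following is a minimal userspace sketch of the union-mask scheme described
in the changelog. It is not the kernel code: every toy_* name is
hypothetical, a uint64_t stands in for struct cpumask (so CPU numbers must
be below 64), and there is no locking. It only illustrates the three
operations the patch touches: OR on enqueue, clear when the queue empties,
and a single bit test before the scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
#define TOY_MAX_TASKS 8

struct toy_task {
	uint64_t allowed;	/* bit n set => task may run on CPU n (n < 64) */
};

struct toy_dsq {
	struct toy_task *tasks[TOY_MAX_TASKS];	/* FIFO order, index 0 is head */
	size_t nr;
	uint64_t cpus_allowed;	/* union of the masks of enqueued tasks */
	unsigned long scans;	/* number of full queue walks, for illustration */
};

/* Enqueue at the tail and fold the task's mask into the queue's union. */
static bool toy_enqueue(struct toy_dsq *q, struct toy_task *t)
{
	if (q->nr == TOY_MAX_TASKS)
		return false;
	q->tasks[q->nr++] = t;
	q->cpus_allowed |= t->allowed;
	return true;
}

/* Remove the entry at idx, keeping order; reset the union once empty. */
static void toy_remove_at(struct toy_dsq *q, size_t idx)
{
	size_t i;

	for (i = idx; i + 1 < q->nr; i++)
		q->tasks[i] = q->tasks[i + 1];
	q->nr--;
	if (q->nr == 0)
		q->cpus_allowed = 0;
}

/* Return the first task runnable on cpu (cpu < 64), or NULL if none. */
static struct toy_task *toy_consume(struct toy_dsq *q, unsigned int cpu)
{
	uint64_t bit = UINT64_C(1) << cpu;
	size_t i;

	/* Single bit test: no queued task ever allowed this CPU. */
	if (!(q->cpus_allowed & bit))
		return NULL;

	q->scans++;
	for (i = 0; i < q->nr; i++) {
		struct toy_task *t = q->tasks[i];

		if (t->allowed & bit) {
			toy_remove_at(q, i);
			return t;
		}
	}

	/* The bit was left behind by a task that has already been removed. */
	return NULL;
}
```

Because the union is only reset when the queue becomes empty, it is a
superset of what the remaining tasks allow. A set bit therefore means "scan
the queue", not "a runnable task exists"; only a clear bit allows the scan
to be skipped.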