From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A71F734CFD4; Wed, 5 Nov 2025 21:04:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376652; cv=none; b=SIlCGA7xZ0CzsyYKnQ728bvfhfX6sV3ukYJsipuyGI0JmHqc7iEPMXAd7ajqww1JbDBeFNll7VnrLbmY+n2CLHRSF93E4YqbwbI0vSY1BKK9yGlAVHPkNhl9XVOZp6S39R3LgDyYMeMKSlA7iPKacQp8LPiUYWL1e6YovW6qMow= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376652; c=relaxed/simple; bh=8Q71EO557AK60zUjy5zJPPkN07FPtYd1UuHTAzHqsFk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AziKYoli+ok5+UGMKOm+eJ1IuFyKk0hz67fCC1R+Bj80r/oQM8Uzux4AzvNv7NQfeEYo3c3TPi8AmVDcSUYU/IUCGz1nW0ybEx+gxioDQsMFR5LRkEUKGth68r6fjitQHmzoc1+D6m1GjMlIQjHPVfAxlNHq5sNKAH2ibB1Llhs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=AJURCUSi; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="AJURCUSi" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 13AB5C116D0; Wed, 5 Nov 2025 21:04:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376650; bh=8Q71EO557AK60zUjy5zJPPkN07FPtYd1UuHTAzHqsFk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=AJURCUSilbHUliUZRLrgr+YTl80lHSkFVtDFlHcKKO/sO5PBq/MhEJUXJz916fP81 wAbJ8Yy5s6zsZm9lBVZoeQXrmyCrVMNai4ryobrKlPGWI6PI6p6Jep72D7ahE5dvcd k4C/l6tC0UJksuZsxtngb12Gs/wAUhM8HZr8RudMlyYPQQHJ6z+N1Fiqf5+++lztgm sINWjeiVCtGxC9uvySEPj5+spW1J3KjyQn3Fa6fEASxYDs+prWyAMn68bY/cKaaxjd 
txBp5Oc4fioT76vUk0O5Kgl+uyh1HTeyhNjK41/nevk/wEXq14ugto76jhLUchVfJp 9jeOdMslbHixg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 01/31] PCI: Prepare to protect against concurrent isolated cpuset change Date: Wed, 5 Nov 2025 22:03:17 +0100 Message-ID: <20251105210348.35256-2-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" HK_TYPE_DOMAIN will soon integrate cpuset isolated partitions and therefore be made modifiable at runtime. Synchronize against the cpumask update using RCU. The RCU locked section includes both the housekeeping CPU target election for the PCI probe work and the work enqueue. This way the housekeeping update side will simply need to flush the pending related works after updating the housekeeping mask in order to make sure that no PCI work ever executes on an isolated CPU. This part will be handled in a subsequent patch. 
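The target election described above can be modelled in userspace as a rough sketch. This is not kernel code: the 64-bit masks stand in for cpumasks, and the names model_first_cpu_and() and model_elect_probe_cpu() are hypothetical. The RCU read-side section is not modelled; only the mask arithmetic and the "no eligible CPU" fallback are shown.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for cpumask_any_and(): return the first CPU set in
 * both masks, or nr_cpus (playing the role of nr_cpu_ids) if none is.
 */
int model_first_cpu_and(uint64_t a, uint64_t b, int nr_cpus)
{
	uint64_t both = a & b;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	}
	return nr_cpus;
}

/*
 * Elect the CPU that would run the probe work: it must belong to the
 * device's node and to both housekeeping masks. A return value of
 * nr_cpus means no CPU qualifies and the caller should probe locally.
 */
int model_elect_probe_cpu(uint64_t node_mask, uint64_t hk_wq,
			  uint64_t hk_domain, int nr_cpus)
{
	uint64_t wq_domain_mask = hk_wq & hk_domain;

	/* One bit per CPU in a uint64_t limits this model to 64 CPUs. */
	assert(nr_cpus > 0 && nr_cpus <= 64);
	return model_first_cpu_and(node_mask, wq_domain_mask, nr_cpus);
}
```

In the patch itself the election and schedule_work_on() sit inside one rcu_read_lock() section, so a later flush on the update side is expected to cover every work item queued against the old mask.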

Signed-off-by: Frederic Weisbecker
---
 drivers/pci/pci-driver.c | 47 ++++++++++++++++++++++++++++++++--------
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 302d61783f6c..7b74d22b20f7 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -302,9 +302,8 @@ struct drv_dev_and_id {
 	const struct pci_device_id *id;
 };
 
-static long local_pci_probe(void *_ddi)
+static int local_pci_probe(struct drv_dev_and_id *ddi)
 {
-	struct drv_dev_and_id *ddi = _ddi;
 	struct pci_dev *pci_dev = ddi->dev;
 	struct pci_driver *pci_drv = ddi->drv;
 	struct device *dev = &pci_dev->dev;
@@ -338,6 +337,19 @@ static long local_pci_probe(void *_ddi)
 	return 0;
 }
 
+struct pci_probe_arg {
+	struct drv_dev_and_id *ddi;
+	struct work_struct work;
+	int ret;
+};
+
+static void local_pci_probe_callback(struct work_struct *work)
+{
+	struct pci_probe_arg *arg = container_of(work, struct pci_probe_arg, work);
+
+	arg->ret = local_pci_probe(arg->ddi);
+}
+
 static bool pci_physfn_is_probed(struct pci_dev *dev)
 {
 #ifdef CONFIG_PCI_IOV
@@ -362,34 +374,51 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 	dev->is_probed = 1;
 
 	cpu_hotplug_disable();
-
 	/*
 	 * Prevent nesting work_on_cpu() for the case where a Virtual Function
 	 * device is probed from work_on_cpu() of the Physical device.
 	 */
 	if (node < 0 || node >= MAX_NUMNODES || !node_online(node) ||
 	    pci_physfn_is_probed(dev)) {
-		cpu = nr_cpu_ids;
+		error = local_pci_probe(&ddi);
 	} else {
 		cpumask_var_t wq_domain_mask;
+		struct pci_probe_arg arg = { .ddi = &ddi };
+
+		INIT_WORK_ONSTACK(&arg.work, local_pci_probe_callback);
 
 		if (!zalloc_cpumask_var(&wq_domain_mask, GFP_KERNEL)) {
 			error = -ENOMEM;
 			goto out;
 		}
+
+		/*
+		 * The target election and the enqueue of the work must be within
+		 * the same RCU read side section so that when the workqueue pool
+		 * is flushed after a housekeeping cpumask update, further readers
+		 * are guaranteed to queue the probing work to the appropriate
+		 * targets.
+		 */
+		rcu_read_lock();
 		cpumask_and(wq_domain_mask,
 			    housekeeping_cpumask(HK_TYPE_WQ),
 			    housekeeping_cpumask(HK_TYPE_DOMAIN));
 
 		cpu = cpumask_any_and(cpumask_of_node(node),
 				      wq_domain_mask);
+		if (cpu < nr_cpu_ids) {
+			schedule_work_on(cpu, &arg.work);
+			rcu_read_unlock();
+			flush_work(&arg.work);
+			error = arg.ret;
+		} else {
+			rcu_read_unlock();
+			error = local_pci_probe(&ddi);
+		}
+
 		free_cpumask_var(wq_domain_mask);
+		destroy_work_on_stack(&arg.work);
 	}
-
-	if (cpu < nr_cpu_ids)
-		error = work_on_cpu(cpu, local_pci_probe, &ddi);
-	else
-		error = local_pci_probe(&ddi);
 out:
 	dev->is_probed = 0;
 	cpu_hotplug_enable();
-- 
2.51.0

From nobody Sun Dec 14 22:11:54 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7054234D4FD; Wed, 5 Nov 2025 21:04:18 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376658; cv=none;
b=opz2a/PUvXppAGuh0h4rkJqgHReyF7pfr9l2d8rd+C6UX7aacZuIYCcbxeILxJu6rrNORcaWwRG8WgCbjxlopysI1Otjiub/+987d0oPtgIf3q19pxCB46aTIOHltrLpTmJkkr1GfROo5POFzPoDvC0F0SvGgJoEOE43jJrDJ2Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376658; c=relaxed/simple; bh=bjPVH71yYHRT38x+a2qhkcGYmz0Liwe3xJMHVBXj3C4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=iitprQ9BLUHabfORjsaefSQQuVEAJ4RfgarSWXG1Qxum1F7vWNhb1bnHLXL5UM8Ap12n6y0SQy52SCCAYJcbQJokuzm328OTuUZ/99L6tcLFFocOWSUYZ6pkyUiFUyqnfW0BJooULDBnizSafZXOSs02lGlAxRU+sBjSuKv4vjk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=KDYIWONq; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="KDYIWONq" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7ED49C19425; Wed, 5 Nov 2025 21:04:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376658; bh=bjPVH71yYHRT38x+a2qhkcGYmz0Liwe3xJMHVBXj3C4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KDYIWONqcQ26ibQzAQkdgLQwpYLoizfgpJwRPDhmDxEYw8bbVT5NCfytM6LWq/UIk d3X9McPNu5W7bwRl1V4OhMilC81nOdPW0GZNtyyKiVvnVvZwjU3V4eRlGb4PCDwa2w oLLKKfw8RYLb9vZQfM9ezsSGTdJpcGLScxRA+EnGQR74nwdqy+MCCwX4jkhZ/5Hcyj KRU9HMvlEYMMfEozwi7kZ2BnBbFbAgiVNuly1Yw+eyZFhmbmRTg4/XmokIJptUndAK okt1ReWDWQxCFWQKpgouktVbzoKoG6YAN9jd11ycTU0odnnKgj2EXJdv1rMNICKGX4 CB4eyIRSu0e+Q== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 02/31] cpu: Revert "cpu/hotplug: Prevent self deadlock on CPU hot-unplug"
Date: Wed, 5 Nov 2025 22:03:18 +0100
Message-ID: <20251105210348.35256-3-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

1) The commit:

	2b8272ff4a70 ("cpu/hotplug: Prevent self deadlock on CPU hot-unplug")

   was added to fix an issue where the hotplug control task (BP) was
   throttled between CPUHP_AP_IDLE_DEAD and CPUHP_HRTIMERS_PREPARE, waiting
   in the hrtimer blindspot for the bandwidth callback queued on the dead
   CPU.

2) Later on, the commit:

	38685e2a0476 ("cpu/hotplug: Don't offline the last non-isolated CPU")

   hooked into the target selection of the workqueue-offloaded CPU down
   process to prevent the last CPU domain from being destroyed.

3) Finally:

	5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")

   entirely removed the conditions for the race exposed and partially
   fixed in 1). Offloading the CPU down process to a workqueue on another
   CPU then becomes unnecessary, but the last CPU belonging to scheduler
   domains must still remain online.

Therefore revert the now obsolete commit
2b8272ff4a70b866106ae13c36be7ecbef5d5da2 and move the housekeeping check
under the write-held cpu_hotplug_lock.
Since HK_TYPE_DOMAIN will include both isolcpus and cpuset isolated
partitions, the hotplug lock will synchronize against concurrent cpuset
partition updates.

Signed-off-by: Frederic Weisbecker
---
 kernel/cpu.c | 37 +++++++++++--------------------------
 1 file changed, 11 insertions(+), 26 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index db9f6c539b28..453a806af2ee 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1410,6 +1410,16 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
 
 	cpus_write_lock();
 
+	/*
+	 * Keep at least one housekeeping cpu onlined to avoid generating
+	 * an empty sched_domain span.
+	 */
+	if (cpumask_any_and(cpu_online_mask,
+			    housekeeping_cpumask(HK_TYPE_DOMAIN)) >= nr_cpu_ids) {
+		ret = -EBUSY;
+		goto out;
+	}
+
 	cpuhp_tasks_frozen = tasks_frozen;
 
 	prev_state = cpuhp_set_state(cpu, st, target);
@@ -1456,22 +1466,8 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
 	return ret;
 }
 
-struct cpu_down_work {
-	unsigned int		cpu;
-	enum cpuhp_state	target;
-};
-
-static long __cpu_down_maps_locked(void *arg)
-{
-	struct cpu_down_work *work = arg;
-
-	return _cpu_down(work->cpu, 0, work->target);
-}
-
 static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
 {
-	struct cpu_down_work work = { .cpu = cpu, .target = target, };
-
 	/*
 	 * If the platform does not support hotplug, report it explicitly to
 	 * differentiate it from a transient offlining failure.
@@ -1480,18 +1476,7 @@ static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
 		return -EOPNOTSUPP;
 	if (cpu_hotplug_disabled)
 		return -EBUSY;
-
-	/*
-	 * Ensure that the control task does not run on the to be offlined
-	 * CPU to prevent a deadlock against cfs_b->period_timer.
-	 * Also keep at least one housekeeping cpu onlined to avoid generating
-	 * an empty sched_domain span.
-	 */
-	for_each_cpu_and(cpu, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN)) {
-		if (cpu != work.cpu)
-			return work_on_cpu(cpu, __cpu_down_maps_locked, &work);
-	}
-	return -EBUSY;
+	return _cpu_down(cpu, 0, target);
 }
 
 static int cpu_down(unsigned int cpu, enum cpuhp_state target)
-- 
2.51.0

From nobody Sun Dec 14 22:11:54 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BDA0034CFB4; Wed, 5 Nov 2025 21:04:25 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376665; cv=none; b=IB+4Cbb9YzAihstGX0zL0B0xhmKF1pbwPQ0wRmufMEjDZWloH9bb3Pru+Lr3Wt1Ei39YBNj2/oTB6pBj60tBOQmkzbT3x5WIbQjph6DHtB8CqsHJeaFx+9KBiItqeoWhb0Y45HJexbnr7HLoGZiXfVlwkF+eNmQjuhDl4SZtGR8=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376665; c=relaxed/simple; bh=QDY6OFwPCpaj5eQjhmw6Wwop/7HtBl8sFRDTAL/mseY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Pm86ySoCua8W7eMkjc0ZnBgq7dfCO+mXO3r56zBYd0i497b4PpQrL9aUeHyEQhRl7OKWzNPTAGX83Aa4lkCC540B/D61C52Wst89PcvXGWJjs3jmxR4jOF9PEMLnhWPmmA0cvWeBlqTI9CH8F07ES6C8yXabVCJ1OFa2rINr9/8=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=LTjEW+Lp; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="LTjEW+Lp"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6C864C116B1; Wed, 5 Nov 2025 21:04:18 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376665; bh=QDY6OFwPCpaj5eQjhmw6Wwop/7HtBl8sFRDTAL/mseY=;
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LTjEW+LpKe78Iwb5IpUCFicQk91WKrouEhNOVkCIWNJWT/IUfA+/1BuMRMTE/GSNf +H7F8egVjXkx/lSP96kC+SuuJLTGF/8XU6ILZHU9JyBzgy3HxvUFPliaEIO6h0Pw4Q NVPHhj8BuptnLIKmapcvKpE/aGeO4jSutNhuT0C4PABWjnpD9Q5cuNIdyWXg3giORC Fi9PFedNEfYKwdFXqkXdRc0iWC1GJiQi2RRoWyt+ov+H9uiPm/BOjklQPfkM6OtIbG I2NCBOxhuNhTQcrNQ8aEH7WXQtqQYaJPe3rmjJTxzn9fdSN39b2M5jDqPW6Yafqx0c MhWXAmuKuHXaw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 03/31] memcg: Prepare to protect against concurrent isolated cpuset change Date: Wed, 5 Nov 2025 22:03:19 +0100 Message-ID: <20251105210348.35256-4-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The HK_TYPE_DOMAIN housekeeping cpumask will soon be made modifiable at runtime. In order to synchronize against memcg workqueue to make sure that no asynchronous draining is pending or executing on a newly made isolated CPU, target and queue a drain work under the same RCU critical section. 

Whenever housekeeping updates the HK_TYPE_DOMAIN cpumask, a memcg
workqueue flush will also be issued (in a subsequent change) to make sure
that no work remains pending after a CPU has been made isolated.

Signed-off-by: Frederic Weisbecker
---
 mm/memcontrol.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4deda33625f4..1033e52ab6cf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1971,6 +1971,13 @@ static bool is_memcg_drain_needed(struct memcg_stock_pcp *stock,
 	return flush;
 }
 
+static void schedule_drain_work(int cpu, struct work_struct *work)
+{
+	guard(rcu)();
+	if (!cpu_is_isolated(cpu))
+		schedule_work_on(cpu, work);
+}
+
 /*
  * Drains all per-CPU charge caches for given root_memcg resp. subtree
  * of the hierarchy under it.
@@ -2000,8 +2007,8 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
 				      &memcg_st->flags)) {
 			if (cpu == curcpu)
 				drain_local_memcg_stock(&memcg_st->work);
-			else if (!cpu_is_isolated(cpu))
-				schedule_work_on(cpu, &memcg_st->work);
+			else
+				schedule_drain_work(cpu, &memcg_st->work);
 		}
 
 		if (!test_bit(FLUSHING_CACHED_CHARGE, &obj_st->flags) &&
@@ -2010,8 +2017,8 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
 				      &obj_st->flags)) {
 			if (cpu == curcpu)
 				drain_local_obj_stock(&obj_st->work);
-			else if (!cpu_is_isolated(cpu))
-				schedule_work_on(cpu, &obj_st->work);
+			else
+				schedule_drain_work(cpu, &obj_st->work);
 		}
 	}
 	migrate_enable();
-- 
2.51.0

From nobody Sun Dec 14 22:11:54 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E43A02BD5A7; Wed, 5 Nov 2025 21:04:33 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376674;
cv=none; b=hV9PZb25Q2Docpv3S23S7B99J8Xdh6Of0N0BPw0u633MFLXXllz68OOUSx5LahX+7p6R2mML1CNgybWeZdLHyGuVycodO9fGIDPyNjXV0KilFJWUFne6G7ZEm78ublVsFxmRSGxdD6iiXebOAXPOT/SNZ6JB+mMQrZXtiWSEkAA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376674; c=relaxed/simple; bh=ApQa5R07ipGBiVCyA3EgObV++ftyWFoj/ra3rzP5dCY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=usccpBdhZghUIgDAku7z92DqJZqzJPYnQHuazzV6+DqXIZGBnbtgXPVeaXnaVzeI2022LCH1O3Uz0J/N9tYYqQ9hth8GVtOFfcpWyxpwOMo6Ly4Yz9qg6B/2WZ3u9as8/c6ZkcUkPRua7+oF6bAskAncMlljZKZaanykd8EtyYo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=cyGGAVyE; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="cyGGAVyE" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1FFC7C4AF09; Wed, 5 Nov 2025 21:04:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376673; bh=ApQa5R07ipGBiVCyA3EgObV++ftyWFoj/ra3rzP5dCY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=cyGGAVyEEtnTJleIGvRPvdX/Cc6KEP2KDLlXtGtmXgRrPqu8Hr6k4dV2PS4NZXG/b 0TuJ9Xv9ZVXQWSh2Q5s5CnahVU/19PTlWf5bvziWhR6igHAhOvKcyNWxStp0IREsuQ dCvjHLJNUd14dOSieLUuFACfi9H0I3xpfHEEvQVGyxubT+iCI4qTrf+SBosWgX9JwY q8wUhQKNtbk77Gb5lvCUyYfdwPpF6f8ZJlnEwEOgiTX7YW4gUFPYiJIdvr5mvwbwxf m8/DbHKO9hBpJpzZlv5oT+4ck92P+ii+pCIyA06mlA6F7sTvBrnc5dZH921cfMjKM3 ISpiuGCwxLlCw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 04/31] mm: vmstat: Prepare to protect against concurrent isolated cpuset change
Date: Wed, 5 Nov 2025 22:03:20 +0100
Message-ID: <20251105210348.35256-5-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The HK_TYPE_DOMAIN housekeeping cpumask will soon be made modifiable at
runtime. In order to synchronize against the vmstat workqueue and make
sure that no asynchronous vmstat work is pending or executing on a newly
isolated CPU, target and queue a vmstat work under the same RCU read side
critical section.

Whenever housekeeping updates the HK_TYPE_DOMAIN cpumask, a vmstat
workqueue flush will also be issued (in a subsequent change) to make sure
that no work remains pending after a CPU has been made isolated.

Signed-off-by: Frederic Weisbecker
---
 mm/vmstat.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index bb09c032eecf..7afb2981501f 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -2135,11 +2135,13 @@ static void vmstat_shepherd(struct work_struct *w)
 		 * infrastructure ever noticing. Skip regular flushing from vmstat_shepherd
 		 * for all isolated CPUs to avoid interference with the isolated workload.
 		 */
-		if (cpu_is_isolated(cpu))
-			continue;
+		scoped_guard(rcu) {
+			if (cpu_is_isolated(cpu))
+				continue;
 
-		if (!delayed_work_pending(dw) && need_update(cpu))
-			queue_delayed_work_on(cpu, mm_percpu_wq, dw, 0);
+			if (!delayed_work_pending(dw) && need_update(cpu))
+				queue_delayed_work_on(cpu, mm_percpu_wq, dw, 0);
+		}
 
 		cond_resched();
 	}
-- 
2.51.0

From nobody Sun Dec 14 22:11:54 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F1F1F34CFDE; Wed, 5 Nov 2025 21:04:41 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376682; cv=none; b=FBHTn8qfbYXdyQpabIV7OborPyKdAW5uFibt4pb4vKctpIPswTRmDMjSh7YVNCythG1Ka70mvVtC5T2vmO5tLTnPqrnLBGAafj6Z/3Lz0xEColazeKBR7SbnO2goUC7J/1F5hX4snU+6FyLqTIGzBj4hT63rSzqaPt7+3fbofvw=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376682; c=relaxed/simple; bh=vCqQjRKewdSvMhojsrACl76twRmsMBvlBV6cm47g7uk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CLOLlAgGPj5IHB7aelndiV09Ln98JchECzF+Tt/p/4r8LTXiDXq4+jFNrowmoZ/Oo/2zBcxzw3+lRQB5uIk/FdfqSoXAftPnkPjWJMGfU++zabsm8+h7QtigaPzvdWdtWsTbjhki2JIU5beq27qSMMr3DWoFhuLvdhyvdby3cV4=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=T5kGeur4; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="T5kGeur4"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id CC92FC116D0; Wed, 5 Nov 2025 21:04:33 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376681;
bh=vCqQjRKewdSvMhojsrACl76twRmsMBvlBV6cm47g7uk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=T5kGeur44NFLworbF2dzO61n4fKCzm2Vesu/gMHqpcwG2IH7hRGsS9PE3e3XFJnQC J0x2USmpchyTqj1LejPMzaUkNNfX70M3i2D7ma/ibICB95gMSZHSQ1EFwshSOEi0w3 Y2xGpqy0WEnVnV5hr5rHANSN95LQq4K0VBQdv8Bpc5HhxD51Fp/g/DRMKEUmJSI5CF B9gp8T6FWJNEKckRpqLNdjGecevv0JO7AyVADISC9OI6I7LsupnxmUoheiOeM/E5ra KXQhbVGao3xX3Opx9iLWtpyIpgQtuSCz1guXSJCOca83e+ubrEX1RpgKRH5FBPyQ5/ wWGOm2PHDKFtQ==
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker , Michal Koutný , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 05/31] sched/isolation: Save boot defined domain flags
Date: Wed, 5 Nov 2025 22:03:21 +0100
Message-ID: <20251105210348.35256-6-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

HK_TYPE_DOMAIN will soon integrate not only boot defined isolcpus= CPUs
but also cpuset isolated partitions.
Housekeeping still needs a way to record what was initially passed to
isolcpus= in order to keep these CPUs isolated after a cpuset isolated
partition is modified or destroyed while containing some of them.

Create a new HK_TYPE_DOMAIN_BOOT to keep track of those.

Signed-off-by: Frederic Weisbecker
Reviewed-by: Phil Auld
---
 include/linux/sched/isolation.h | 4 ++++
 kernel/sched/isolation.c        | 5 +++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index d8501f4709b5..109a2149e21a 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -7,8 +7,12 @@
 #include
 
 enum hk_type {
+	/* Revert of boot-time isolcpus= argument */
+	HK_TYPE_DOMAIN_BOOT,
 	HK_TYPE_DOMAIN,
+	/* Revert of boot-time isolcpus=managed_irq argument */
 	HK_TYPE_MANAGED_IRQ,
+	/* Revert of boot-time nohz_full= or isolcpus=nohz arguments */
 	HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_MAX,
 
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index a4cf17b1fab0..8690fb705089 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -11,6 +11,7 @@
 #include "sched.h"
 
 enum hk_flags {
+	HK_FLAG_DOMAIN_BOOT	= BIT(HK_TYPE_DOMAIN_BOOT),
 	HK_FLAG_DOMAIN		= BIT(HK_TYPE_DOMAIN),
 	HK_FLAG_MANAGED_IRQ	= BIT(HK_TYPE_MANAGED_IRQ),
 	HK_FLAG_KERNEL_NOISE	= BIT(HK_TYPE_KERNEL_NOISE),
@@ -216,7 +217,7 @@ static int __init housekeeping_isolcpus_setup(char *str)
 
 		if (!strncmp(str, "domain,", 7)) {
 			str += 7;
-			flags |= HK_FLAG_DOMAIN;
+			flags |= HK_FLAG_DOMAIN | HK_FLAG_DOMAIN_BOOT;
 			continue;
 		}
 
@@ -246,7 +247,7 @@ static int __init housekeeping_isolcpus_setup(char *str)
 
 	/* Default behaviour for isolcpus without flags */
 	if (!flags)
-		flags |= HK_FLAG_DOMAIN;
+		flags |= HK_FLAG_DOMAIN | HK_FLAG_DOMAIN_BOOT;
 
 	return housekeeping_setup(str, flags);
 }
-- 
2.51.0

From nobody Sun Dec 14 22:11:54 2025
Received: from smtp.kernel.org
(aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C38ED34CFDE; Wed, 5 Nov 2025 21:04:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376689; cv=none; b=tsEY1RVZoAhpBsda0qqh7WM8Jjt7wXe/yQccMurwZvKsFW4lKtJOSxywfrh1UbOULkmLIoYavfewcnEipaLqRwsW8sybdaZTxCbH/KdstCDTOsSH8Abv4B2OTBom+1OAjQMF542BqnjoVFmu81Fm3vAPvQHevATnyvisI312/fU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376689; c=relaxed/simple; bh=4bRWbV0YuzmDy1B/sRgCWln6RMh7KL9GReKc3Nnqm6c=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=p2gNhtYbD5ljrqABwfFyEx8DTZpRfo81JX/vfif6i33AzcCg9C1cI3GUMbJKetQRt0aoa/qPT2m6o4BeJb+taMVoMhLI7Fmdtkz4uE2JrucyM7Mevo/27/TPPFfLtTijMbZANgJAFbIGYWTLoYbWqFwH0m0IqzmCn3aeyXdKHkI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=W3jxBxnc; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="W3jxBxnc" Received: by smtp.kernel.org (Postfix) with ESMTPSA id F24B5C4CEF5; Wed, 5 Nov 2025 21:04:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376689; bh=4bRWbV0YuzmDy1B/sRgCWln6RMh7KL9GReKc3Nnqm6c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=W3jxBxnchXPJZKH2so0QCGrjyVavB6XgcFLZnwbf0ylrkboEezz9X/2eMG8q91HHH IAy92pFmbbdtvG33O3dU7hzjvMro+XMvvQrNO/4OHnD9UvLBMJlxfs33g9qiYG2hYK Z0ilCk2hxybBfgiC79MXfe3wp5DcBV55tZ8C2OtTr6lrdTxYZyW0YhSWQqXLYhiguS KJRtlYb248Pfs9ZOvisxol44RbREaI0m1UwUFynnMWwx9v9jprD0lSiNDM6UD69QkS z51fmf04HQKK4hqG5yCwVC8LmdU7H4H6mQNm+oJwCuT4iwU5cBY2Mo98ASdwndibIo 
jTUvvHb6LCGVg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 06/31] cpuset: Convert boot_hk_cpus to use HK_TYPE_DOMAIN_BOOT Date: Wed, 5 Nov 2025 22:03:22 +0100 Message-ID: <20251105210348.35256-7-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" boot_hk_cpus is an ad-hoc copy of HK_TYPE_DOMAIN_BOOT. Remove it and use the official version. 
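The conflict rule that this conversion preserves can be illustrated with a small userspace model. This is only a sketch under stated assumptions: 64-bit masks replace cpumasks, the enum and function names (model_prstate, model_mask_subset(), model_housekeeping_conflict()) are hypothetical, and boot_isolation_enabled stands in for housekeeping_enabled(HK_TYPE_DOMAIN_BOOT).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical partition states; only "isolated" matters for the check. */
enum model_prstate { MODEL_PRS_MEMBER, MODEL_PRS_ROOT, MODEL_PRS_ISOLATED };

/* Analog of cpumask_subset(): true if every CPU in sub is also in super. */
bool model_mask_subset(uint64_t sub, uint64_t super)
{
	return (sub & ~super) == 0;
}

/*
 * CPUs outside the boot-time housekeeping mask may only be used by an
 * isolated partition. Without boot-time isolation there is no conflict.
 */
bool model_housekeeping_conflict(enum model_prstate prstate, uint64_t new_cpus,
				 bool boot_isolation_enabled,
				 uint64_t hk_domain_boot)
{
	if (!boot_isolation_enabled)
		return false;

	return prstate != MODEL_PRS_ISOLATED &&
	       !model_mask_subset(new_cpus, hk_domain_boot);
}
```

With a boot housekeeping mask of CPUs 0-3 (0x0F), a non-isolated partition asking for CPU 4 conflicts in this model, while an isolated partition asking for the same CPU does not.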

Signed-off-by: Frederic Weisbecker
Reviewed-by: Phil Auld
Reviewed-by: Chen Ridong
---
 kernel/cgroup/cpuset.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 52468d2c178a..8595f1eadf23 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -81,12 +81,6 @@ static cpumask_var_t subpartitions_cpus;
  */
 static cpumask_var_t	isolated_cpus;
 
-/*
- * Housekeeping (HK_TYPE_DOMAIN) CPUs at boot
- */
-static cpumask_var_t	boot_hk_cpus;
-static bool		have_boot_isolcpus;
-
 /* List of remote partition root children */
 static struct list_head	remote_children;
 
@@ -1686,15 +1680,16 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
  * @new_cpus: cpu mask
  * Return: true if there is conflict, false otherwise
  *
- * CPUs outside of boot_hk_cpus, if defined, can only be used in an
+ * CPUs outside of HK_TYPE_DOMAIN_BOOT, if defined, can only be used in an
  * isolated partition.
*/ static bool prstate_housekeeping_conflict(int prstate, struct cpumask *new= _cpus) { - if (!have_boot_isolcpus) + if (!housekeeping_enabled(HK_TYPE_DOMAIN_BOOT)) return false; =20 - if ((prstate !=3D PRS_ISOLATED) && !cpumask_subset(new_cpus, boot_hk_cpus= )) + if ((prstate !=3D PRS_ISOLATED) && + !cpumask_subset(new_cpus, housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT))) return true; =20 return false; @@ -3824,12 +3819,9 @@ int __init cpuset_init(void) =20 BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL)); =20 - have_boot_isolcpus =3D housekeeping_enabled(HK_TYPE_DOMAIN); - if (have_boot_isolcpus) { - BUG_ON(!alloc_cpumask_var(&boot_hk_cpus, GFP_KERNEL)); - cpumask_copy(boot_hk_cpus, housekeeping_cpumask(HK_TYPE_DOMAIN)); - cpumask_andnot(isolated_cpus, cpu_possible_mask, boot_hk_cpus); - } + if (housekeeping_enabled(HK_TYPE_DOMAIN_BOOT)) + cpumask_andnot(isolated_cpus, cpu_possible_mask, + housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT)); =20 return 0; } --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8ABEB34D4F3; Wed, 5 Nov 2025 21:04:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376697; cv=none; b=IC7rJXTUmnd+7PqjmC7TDqMWuMotCe3p/BbpGVDRL6yL3xB9/M0hn3JMeESvZODxiRJEb/HSQuHEjCDw9YglPWwvwQ0gPdvFEdelb/o7o8C1/Bhj8J35LNPNB9BogMjVHlAwlIHPTt23SzjmNWfvrnvRni5t/xgT5F01Wz9XVus= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376697; c=relaxed/simple; bh=SYoV9ZciDsKT7uyTlAb1HvnL9v8aWR3nKq+oz55/MdA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=Scelxm7IE1kIyznZQC2KkegnUkhfBYJEbSGbA599KasOEYpNYriHh2wcv6+DbH9vHBKeHOpG2xWZpZZoBzsBt/gwwu/P18k6O+3EyR5tZvApITQE4r6eGfV2mpu8C5xZXWv291JkCrc1L01YFyzpHBwwxe0ZSJL2HxCU3NCko1s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=kItBEaDN; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="kItBEaDN" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B6B89C116D0; Wed, 5 Nov 2025 21:04:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376697; bh=SYoV9ZciDsKT7uyTlAb1HvnL9v8aWR3nKq+oz55/MdA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kItBEaDNwg83NtBZHN0PY2kO9v4qG9+R9UGhFJXkfz3oeoYG2txCyhHm8tQi1BkuD NWItfaf8aoElYQ+RjvhVLcKuuhff34s1plgMvHf/SEFle9YrkEyqQFq6fErNgLNiEW uYEF73IW+07iSZb619oMuwv6PrVbrMjfvyg+P5grwt/pH++e9JAObxCWjuc5mzZKMy aZ1dugkuLqmPVgs+1gxfJiF2/ehXeFrMHeF5pPRKqaxAvbJcpRFu4H4XfmMUY8PXiD CpEWomCTEysE3h4vCqrk/emFOvpj2sqiG0G0gliDyXtIxhlNYCuVzrEsNmaUc2145Y 6h2pO0qbg3p6A== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 07/31] driver core: cpu: Convert /sys/devices/system/cpu/isolated to use HK_TYPE_DOMAIN_BOOT Date: Wed, 5 Nov 2025 22:03:23 +0100 Message-ID: <20251105210348.35256-8-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Make sure /sys/devices/system/cpu/isolated only prints what was passed through the isolcpus=3D parameter, before HK_TYPE_DOMAIN also integrates cpuset isolated partitions.
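As a rough userspace illustration of what the sysfs file reports: the isolated set is the possible CPUs minus the boot-time housekeeping CPUs, printed as a CPU list. This is only a sketch; isolated_mask() and format_cpulist() are hypothetical helpers, and CPUs are assumed to fit in a 64-bit word. The kernel itself uses cpumask_andnot() and the "%*pbl" format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Models cpumask_andnot(isolated, cpu_possible_mask, boot housekeeping mask). */
static uint64_t isolated_mask(uint64_t possible, uint64_t boot_hk)
{
	return possible & ~boot_hk;
}

/*
 * Format a mask as a CPU list such as "2-3,6", similar in spirit to the
 * kernel's "%*pbl" output. Returns the number of characters stored in buf.
 */
static int format_cpulist(uint64_t mask, char *buf, size_t len)
{
	int n = 0;

	assert(len > 0);
	buf[0] = '\0';

	for (int cpu = 0; cpu < 64; cpu++) {
		int start = cpu;

		if (!(mask & (UINT64_C(1) << cpu)))
			continue;

		/* Extend the range while the next CPU is also set. */
		while (cpu + 1 < 64 && (mask & (UINT64_C(1) << (cpu + 1))))
			cpu++;

		if (start == cpu)
			n += snprintf(buf + n, len - n, "%s%d", n ? "," : "", start);
		else
			n += snprintf(buf + n, len - n, "%s%d-%d", n ? "," : "", start, cpu);

		/* snprintf() reports the untruncated length; stop if buf is full. */
		if ((size_t)n >= len)
			return (int)len - 1;
	}

	return n;
}
```

For eight possible CPUs with CPUs 0-1 and 4-5 as boot-time housekeeping, the sketch formats the isolated set as "2-3,6-7".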
Signed-off-by: Frederic Weisbecker --- drivers/base/cpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index fa0a2eef93ac..050f87d5b8d4 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -291,7 +291,7 @@ static ssize_t print_cpus_isolated(struct device *dev, return -ENOMEM; =20 cpumask_andnot(isolated, cpu_possible_mask, - housekeeping_cpumask(HK_TYPE_DOMAIN)); + housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT)); len =3D sysfs_emit(buf, "%*pbl\n", cpumask_pr_args(isolated)); =20 free_cpumask_var(isolated); --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CDE1534B663; Wed, 5 Nov 2025 21:05:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376707; cv=none; b=BPDXey+ktroqYaPT1cXMp4m43tbKLbleq1Oy7KJMfd/84Odbkvj2jpgeL/gZ390LAZ577M51sIL7KZdQSsR69ETMeQH1KGT2pB5P4pd0dDGRTXOSivDk+7OuqtOrlt5wirOLPLLwhIgHYxy25AaAijXGGTxwZrEhL+BmfWLvrTg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376707; c=relaxed/simple; bh=27i30BHoqn/BGY6ARghdzM2ne1ZlXRlOL17r9Y9xQIc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=EZJqcAOdD9yQVXhuvNTuLFJo6LNqunTKijDaCKl3SXHl1GExMnTdoMaeMUN4X6Hap40wQ3DcnXb6NG3JxYkliNETM/VL8s0UG3TyTteLBFSthVHHSEGB4l9qA64VcXdibbGKox/mifNeJsqLTa0Xh0dGwh7IO0PT58vdZefUJYo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JjCYtsp3; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org 
header.b="JjCYtsp3" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A1FF4C4CEF5; Wed, 5 Nov 2025 21:04:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376705; bh=27i30BHoqn/BGY6ARghdzM2ne1ZlXRlOL17r9Y9xQIc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JjCYtsp3arDlgdSCfdYrEjKJJfAZqUk+QbEPyc2euy9ldASDGzJtMX8bz/0nOmNgj zsZ+vpJgn3AxZvB3v7wgjYYdLkl/efj/scWiKiLXsx7hEPwP+jUEEox9tNkyI+T8LR lRCed7eIknTR5DbF0lhv3ITbCqLMbkvyjAk9puPccy0ayechj8tB8UtvPdguu3Yw0Y BCKDGsuZwbOlB4jsmIR08Lzzhb+iELcBSQO7SkLi+GCXK5mmbiJ9L1dTXuMryEm/UZ XtSns050xBjEYfez0HKsI7LLuRlY0s7cEVB1GqOcLsUXQcxaSKFvbdXSgs2bx+2oc6 SNhAbL25yT/1A== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 08/31] net: Keep ignoring isolated cpuset change Date: Wed, 5 Nov 2025 22:03:24 +0100 Message-ID: <20251105210348.35256-9-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The RPS cpumask can be overridden through sysfs/sysctl.
The boot-defined isolated CPUs are then excluded from that cpumask. However, HK_TYPE_DOMAIN will soon integrate cpuset isolated CPU updates, and the RPS infrastructure needs more thought before it can propagate such changes and synchronize against them. Keep handling only what was passed through "isolcpus=3D" for now. Signed-off-by: Frederic Weisbecker --- net/core/net-sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index ca878525ad7c..07624b682b08 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -1022,7 +1022,7 @@ static int netdev_rx_queue_set_rps_mask(struct netdev= _rx_queue *queue, int rps_cpumask_housekeeping(struct cpumask *mask) { if (!cpumask_empty(mask)) { - cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_DOMAIN)); + cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT)); cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_WQ)); if (cpumask_empty(mask)) return -EINVAL; --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A486E34B1AE; Wed, 5 Nov 2025 21:05:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376713; cv=none; b=LpqocLgt4gBucOC8y8YcePaejDMluBglBghKXYF+ML8q5DlA/vjYAEWDhjg1p8vDMlRonfNOC7KY0ElhOyX4R8rFZAu8LDnjOWwuq3ByDWUZ9UKcAy9nHvgGClnZ/WMbANJkj6p7+Fnxm7ZPku1DNrcC0NNvtxEvtjFeVzZSEPg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376713; c=relaxed/simple; bh=9PqeZuz2EE5nGKqtZVUtfHjP+RBKx+k8UZ4YoNv5ao8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=h2+hfeLpBKDm9BqYj4k7pqhKdE2MHAKtpKyHkv2yooLoYbvB2lvXZh3q/SrQZ12+7DfYYrK2gmsmvmTrPcpOd5MVqWalM6b0lbusoTJfuKctHI8/YYcGiFjzfVdmy7jjKx0oIFM3rdjLZFK5sBpzTCnk6iPqDBdAZfEfzzHt7Cg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=hIVpOBpq; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hIVpOBpq" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B1C8DC116B1; Wed, 5 Nov 2025 21:05:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376713; bh=9PqeZuz2EE5nGKqtZVUtfHjP+RBKx+k8UZ4YoNv5ao8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hIVpOBpqfmDE5JQSQL6FKNplt/gJBikhmp1UFRydi+QdTh8oOUYJQlGiB4XmWUTH+ Vqr9dPoZ1kcgFGocAY+e2Wx9dvdHV5fLeJdsh/b30bGh3/fnHbgkXpEqxNce/04c6R DyLBgJ2XOfusrMjIL8NhE2/aR01FlDQSc9QdkYFnrLh3UayYn6US9vZO5mM2muRRM2 y0sUBgROywsAL3lRHGT/BtiVj7xh/6lhWY4kslHyoee8mxsAoMZhQuf9SVDiUh5iIX PuYTBP3315vJsPvAkzANuVXMu+SqS8chEInT75TJVn1gTfvUqrAJADrwdphmm2YvUb rPFeAsWq/Uc6w== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 09/31] block: Protect against concurrent isolated cpuset change Date: Wed, 5 Nov 2025 22:03:25 +0100 Message-ID: <20251105210348.35256-10-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The block subsystem prevents its workqueue from running on isolated CPUs, including those defined by cpuset isolated partitions. Since HK_TYPE_DOMAIN will soon contain both boot-defined and cpuset-defined isolated CPUs and be subject to runtime modifications, synchronize against housekeeping updates using RCU. For full support of cpuset changes, the block subsystem may need to propagate changes to the isolated cpumask through the workqueue in the future. Signed-off-by: Frederic Weisbecker --- block/blk-mq.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index d626d32f6e57..7a6137b15dee 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4240,12 +4240,16 @@ static void blk_mq_map_swqueue(struct request_queue= *q) =20 /* * Rule out isolated CPUs from hctx->cpumask to avoid - * running block kworker on isolated CPUs + * running block kworker on isolated CPUs. + * FIXME: cpuset should propagate further changes to isolated CPUs + * here.
*/ + rcu_read_lock(); for_each_cpu(cpu, hctx->cpumask) { if (cpu_is_isolated(cpu)) cpumask_clear_cpu(cpu, hctx->cpumask); } + rcu_read_unlock(); =20 /* * Initialize batch roundrobin counts --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 58C3734CFAE; Wed, 5 Nov 2025 21:05:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376721; cv=none; b=sNYUHfa37POMGWnF+a2A5+Ml2j5f8reJ4znpoVWxiJ0AZ2hGrMcFeF9MxVjGp8I5WMwWRS/iiLsg/vY8jDmJvuW0KQhH4nxzszgu1KjQnILBfRs3tdDJW4JAZcfjn2h/CGhuvXJJKMQjGm+8SjL13ypdUN5wbJ44GjGQ/xOfxTA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376721; c=relaxed/simple; bh=069wMqK9bF6D74YIJY8nZAw2J7YjllAfy3UAu+j/YUM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=SWSrX9bJPzTw9IlnKVIxg67ku4U1Gu+J+5VsxK6kGKd//XdJnwb0P93Nr5mlEfM9H/Y+Txxb6tUS7G/liEOgcXP7nragfncwUT7ux3w9XHwg+ciTyIeB0XffEtLAILJk7vz5gpxGl1F9xDnpC0gk3/wqsfFIhZ3JoDBc8flvwzo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=TjPK2nVT; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="TjPK2nVT" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8A6DBC16AAE; Wed, 5 Nov 2025 21:05:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376721; bh=069wMqK9bF6D74YIJY8nZAw2J7YjllAfy3UAu+j/YUM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TjPK2nVTHpm5UvjRrqC3/iVdJyCu2w8vAKnS5aGQJW33UGGO9u2tYQk8dZwYmSas7 
AS1Fq2HfYE07xUSOvClsZpXdjW/NcU2f9X3O1YGo0zlSdDSTnX/4YJxvWxIF6EvsA6 1HoOZJSVB65fyvSzY6HOQQUbrFuT9WvsiOWThdqDs7Vd2rn2VtqKqwNmj9J1f+EOP7 6AQEWF+zq9+uCWgk7nfcuft8ta2vwMKjhXC+7GHobQoc6kg1a6R/Rhgh4HSMk0Rvc7 ajQbnC1myt/oiq3cpaA4PP3eU4zzh01SqlMPVDDuYW12zhVWedDA3qvgsrQ2JhPBUw KsbsHti30dhwQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 10/31] cpu: Provide lockdep check for CPU hotplug lock write-held Date: Wed, 5 Nov 2025 22:03:26 +0100 Message-ID: <20251105210348.35256-11-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" cpuset modifies partitions, including isolated ones, while holding the CPU hotplug lock for reading. This means that holding the CPU hotplug lock for writing is enough to synchronize against housekeeping cpumask changes. Provide a lockdep check to validate that.
Signed-off-by: Frederic Weisbecker --- include/linux/cpuhplock.h | 1 + include/linux/percpu-rwsem.h | 1 + kernel/cpu.c | 5 +++++ 3 files changed, 7 insertions(+) diff --git a/include/linux/cpuhplock.h b/include/linux/cpuhplock.h index f7aa20f62b87..286b3ab92e15 100644 --- a/include/linux/cpuhplock.h +++ b/include/linux/cpuhplock.h @@ -13,6 +13,7 @@ struct device; =20 extern int lockdep_is_cpus_held(void); +extern int lockdep_is_cpus_write_held(void); =20 #ifdef CONFIG_HOTPLUG_CPU void cpus_write_lock(void); diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h index 288f5235649a..c8cb010d655e 100644 --- a/include/linux/percpu-rwsem.h +++ b/include/linux/percpu-rwsem.h @@ -161,6 +161,7 @@ extern void percpu_free_rwsem(struct percpu_rw_semaphor= e *); __percpu_init_rwsem(sem, #sem, &rwsem_key); \ }) =20 +#define percpu_rwsem_is_write_held(sem) lockdep_is_held_type(sem, 0) #define percpu_rwsem_is_held(sem) lockdep_is_held(sem) #define percpu_rwsem_assert_held(sem) lockdep_assert_held(sem) =20 diff --git a/kernel/cpu.c b/kernel/cpu.c index 453a806af2ee..3b0443f7c486 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -534,6 +534,11 @@ int lockdep_is_cpus_held(void) { return percpu_rwsem_is_held(&cpu_hotplug_lock); } + +int lockdep_is_cpus_write_held(void) +{ + return percpu_rwsem_is_write_held(&cpu_hotplug_lock); +} #endif =20 static void lockdep_acquire_cpus_lock(void) --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 66C7C34DB78; Wed, 5 Nov 2025 21:05:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376729; cv=none; 
b=LiqVZFKXjEYRFTCa08oiExmo/SU63HPdHdzkH7Ja00on7vE+BnHpzhEfruIOGjwTWe2O3HERz96lwGc+NoE76ED+p3cIp1k+6WaszWRgVB+9+fqKVP25qbuGbig/ioDBGaukv4/RViHcw9ku4mmu9L99lHLL1Heer8WL/Fv860g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376729; c=relaxed/simple; bh=VdxniS9PgPy7NwhH2qd2yJdMmQz1qplz7zrJIak9Kiw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=oGv27mnrkOZPQgYN2E0xZah5L3VPTt7nEPEnLtEdhTopjWMR3VHZFltXVhjcG/Q611ES8ZVbyrzNqrsEJLSLVJC4V/pGMaRca/RiAD4dMue40uHiY6GkVPxWdj3GDiueq5bDHHEFaMRGiPoPkVj4YVmUWS3QzbavSDprlLLK5Ag= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Ptyyf6KV; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Ptyyf6KV" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A1154C4CEF5; Wed, 5 Nov 2025 21:05:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376729; bh=VdxniS9PgPy7NwhH2qd2yJdMmQz1qplz7zrJIak9Kiw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Ptyyf6KVzJ0lA9mUbJMRTUMDBxFbFqsC7D72x5Es055cz5O6o+7X0VRAI46bJaeX1 IR4WE2Lj3E9TiqBgFbWjMrFSWMNGGAy8Bh2T4cDmUyCakgmccGz0OLgjYIo10G4vez X41hI+3qx0OgMbveZU476QadXo3RIrr04BXSlHz9KQ2vGGxtGVUlpa0WFTm+A/bBvp 1BJar5S1ziv66ELBOGUJEuzhVHcaxUtgsuDeEcD/R0uDbBnw/KgyFHlPjQx0YWswAG UWIjs3xA+jcx7za2UJJZ6K84bgaKV1VKCrBdWGGmlKDLgPoerOSfU/sxU0t89VhBaD WWhAqun83nwlA== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 11/31] cpuset: Provide lockdep check for cpuset lock held Date: Wed, 5 Nov 2025 22:03:27 +0100 Message-ID: <20251105210348.35256-12-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" cpuset modifies partitions, including isolated, while holding the cpuset mutex. This means that holding the cpuset mutex is safe to synchronize against housekeeping cpumask changes. Provide a lockdep check to validate that. 
Signed-off-by: Frederic Weisbecker --- include/linux/cpuset.h | 2 ++ kernel/cgroup/cpuset.c | 7 +++++++ 2 files changed, 9 insertions(+) diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h index 2ddb256187b5..051d36fec578 100644 --- a/include/linux/cpuset.h +++ b/include/linux/cpuset.h @@ -18,6 +18,8 @@ #include #include =20 +extern bool lockdep_is_cpuset_held(void); + #ifdef CONFIG_CPUSETS =20 /* diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 8595f1eadf23..aa1ac7bcf2ea 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -279,6 +279,13 @@ void cpuset_full_unlock(void) cpus_read_unlock(); } =20 +#ifdef CONFIG_LOCKDEP +bool lockdep_is_cpuset_held(void) +{ + return lockdep_is_held(&cpuset_mutex); +} +#endif + static DEFINE_SPINLOCK(callback_lock); =20 void cpuset_callback_lock_irq(void) --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4AC3C34DB78; Wed, 5 Nov 2025 21:05:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376738; cv=none; b=RE+VAJ0l2+bf756uLzWD0BZVywvO9/ZS48xMwxtuQFG0PL1Vgpk5IF3HSY+SV3M0X2Ef6m5ERHcr6IlTd0M7FOkJe8dYFqA9sGy6h6MpN9yVlQfXr96GTZBpTd5ku5zDgaV9ffAyz/Tl3jNXh42dXZIF3vBji7Vbq9RYMEbq2lM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376738; c=relaxed/simple; bh=ZZ+yP0oE5nMuQO76ubRYWghmJmjtLp6J4+mGRrvjt9Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CAqS9LOpefzj2vGVPvchia5p+wcARp9R62JOqQ6xULlKxyEwE2BEXno2Hk5oQbsjwo8pV58X4XdPpMZqAwdl9IHIsWEJzDlR5b6sUlkTEZ4AbwUpk2Ro7i3Zp8PMYTOPm3vD9x2+82uWJcQDu7BpPA0/FHPdjsc++q5m+66D6MU= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=iz1hWSwd; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="iz1hWSwd" Received: by smtp.kernel.org (Postfix) with ESMTPSA id AEAC4C16AAE; Wed, 5 Nov 2025 21:05:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376736; bh=ZZ+yP0oE5nMuQO76ubRYWghmJmjtLp6J4+mGRrvjt9Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iz1hWSwdNKtj6UL9xcFuZBN1Kztkumf9wbEcmW5UjeJ2+UdL5tz673c7YJbvnt7/W c/EsY/6qjrORrgn3JfyeSAFSA6yfuYj5MTyOVvQkADRrio6BIJnZRu8YgpBrNgVOTQ eB/pL0KTpvOBNIGNkXD26pikUAULqAaaEwLWkA8jIkjfsA7l7XWyC38MgQB4N6su+k xJuzdIt/BEh9tFL8ETLq2lHz2UCG4Z9ajVxdFB5FH9oWQbaVglAAEfr62yb7elqNOG NuaQirecRskVf5fcOpqOIlmu+m4oVCOMYMKsiFs1KZvDxh3o6NbGgk4qDg0Xqhox9y ldKy71009PlQQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 12/31] sched/isolation: Convert housekeeping cpumasks to rcu pointers Date: Wed, 5 Nov 2025 22:03:28 +0100 Message-ID: <20251105210348.35256-13-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" HK_TYPE_DOMAIN's cpumask will soon be made modifiable by cpuset. A mechanism is therefore needed to synchronize the updates with the housekeeping cpumask readers. Turn the housekeeping cpumasks into RCU pointers. Once a housekeeping cpumask is modified, the update side will wait for an RCU grace period and propagate the change to interested subsystems when deemed necessary.
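The publish-then-reclaim pattern described above can be sketched in userspace with C11 atomics. This is only a model under stated assumptions: hk_domain, hk_read() and hk_update() are hypothetical names, a 64-bit word stands in for a cpumask, UINT64_MAX stands in for cpu_possible_mask, and there is no real grace period (the sketch is single-threaded, so the old copy is freed immediately).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

struct mask {
	uint64_t bits;
};

/* Hypothetical stand-in for one housekeeping.cpumasks[type] RCU pointer. */
static _Atomic(struct mask *) hk_domain;

/* Reader side: load the current pointer (stands in for rcu_dereference()). */
static uint64_t hk_read(void)
{
	struct mask *m = atomic_load_explicit(&hk_domain, memory_order_acquire);

	/* No override installed: treat every CPU as housekeeping. */
	return m ? m->bits : UINT64_MAX;
}

/* Update side: publish a fresh copy, then reclaim the old one. */
static int hk_update(uint64_t bits)
{
	struct mask *nmask, *omask;

	/* At least one CPU must remain available for housekeeping work. */
	assert(bits != 0);

	nmask = malloc(sizeof(*nmask));
	if (!nmask)
		return -1;
	nmask->bits = bits;

	omask = atomic_exchange_explicit(&hk_domain, nmask, memory_order_acq_rel);

	/*
	 * A real updater must wait for an RCU grace period here so that
	 * concurrent readers are done with omask before it is freed.
	 */
	free(omask);
	return 0;
}
```

Readers never see a partially written mask because they only ever dereference a fully initialized copy; the cost is one allocation per update.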
Signed-off-by: Frederic Weisbecker --- kernel/sched/isolation.c | 58 +++++++++++++++++++++++++--------------- kernel/sched/sched.h | 1 + 2 files changed, 37 insertions(+), 22 deletions(-) diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 8690fb705089..bee6c04be103 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -21,7 +21,7 @@ DEFINE_STATIC_KEY_FALSE(housekeeping_overridden); EXPORT_SYMBOL_GPL(housekeeping_overridden); =20 struct housekeeping { - cpumask_var_t cpumasks[HK_TYPE_MAX]; + struct cpumask __rcu *cpumasks[HK_TYPE_MAX]; unsigned long flags; }; =20 @@ -33,17 +33,28 @@ bool housekeeping_enabled(enum hk_type type) } EXPORT_SYMBOL_GPL(housekeeping_enabled); =20 +const struct cpumask *housekeeping_cpumask(enum hk_type type) +{ + if (static_branch_unlikely(&housekeeping_overridden)) { + if (housekeeping.flags & BIT(type)) { + return rcu_dereference_check(housekeeping.cpumasks[type], 1); + } + } + return cpu_possible_mask; +} +EXPORT_SYMBOL_GPL(housekeeping_cpumask); + int housekeeping_any_cpu(enum hk_type type) { int cpu; =20 if (static_branch_unlikely(&housekeeping_overridden)) { if (housekeeping.flags & BIT(type)) { - cpu =3D sched_numa_find_closest(housekeeping.cpumasks[type], smp_proces= sor_id()); + cpu =3D sched_numa_find_closest(housekeeping_cpumask(type), smp_process= or_id()); if (cpu < nr_cpu_ids) return cpu; =20 - cpu =3D cpumask_any_and_distribute(housekeeping.cpumasks[type], cpu_onl= ine_mask); + cpu =3D cpumask_any_and_distribute(housekeeping_cpumask(type), cpu_onli= ne_mask); if (likely(cpu < nr_cpu_ids)) return cpu; /* @@ -59,28 +70,18 @@ int housekeeping_any_cpu(enum hk_type type) } EXPORT_SYMBOL_GPL(housekeeping_any_cpu); =20 -const struct cpumask *housekeeping_cpumask(enum hk_type type) -{ - if (static_branch_unlikely(&housekeeping_overridden)) - if (housekeeping.flags & BIT(type)) - return housekeeping.cpumasks[type]; - return cpu_possible_mask; -} -EXPORT_SYMBOL_GPL(housekeeping_cpumask); - void 
housekeeping_affine(struct task_struct *t, enum hk_type type) { if (static_branch_unlikely(&housekeeping_overridden)) if (housekeeping.flags & BIT(type)) - set_cpus_allowed_ptr(t, housekeeping.cpumasks[type]); + set_cpus_allowed_ptr(t, housekeeping_cpumask(type)); } EXPORT_SYMBOL_GPL(housekeeping_affine); =20 bool housekeeping_test_cpu(int cpu, enum hk_type type) { - if (static_branch_unlikely(&housekeeping_overridden)) - if (housekeeping.flags & BIT(type)) - return cpumask_test_cpu(cpu, housekeeping.cpumasks[type]); + if (static_branch_unlikely(&housekeeping_overridden) && housekeeping.flag= s & BIT(type)) + return cpumask_test_cpu(cpu, housekeeping_cpumask(type)); return true; } EXPORT_SYMBOL_GPL(housekeeping_test_cpu); @@ -96,20 +97,33 @@ void __init housekeeping_init(void) =20 if (housekeeping.flags & HK_FLAG_KERNEL_NOISE) sched_tick_offload_init(); - + /* + * Realloc with a proper allocator so that any cpumask update + * can indifferently free the old version with kfree(). + */ for_each_set_bit(type, &housekeeping.flags, HK_TYPE_MAX) { + struct cpumask *omask, *nmask =3D kmalloc(cpumask_size(), GFP_KERNEL); + + if (WARN_ON_ONCE(!nmask)) + return; + + omask =3D rcu_dereference(housekeeping.cpumasks[type]); + /* We need at least one CPU to handle housekeeping work */ - WARN_ON_ONCE(cpumask_empty(housekeeping.cpumasks[type])); + WARN_ON_ONCE(cpumask_empty(omask)); + cpumask_copy(nmask, omask); + RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask); + memblock_free(omask, cpumask_size()); } } =20 static void __init housekeeping_setup_type(enum hk_type type, cpumask_var_t housekeeping_staging) { + struct cpumask *mask =3D memblock_alloc_or_panic(cpumask_size(), SMP_CACH= E_BYTES); =20 - alloc_bootmem_cpumask_var(&housekeeping.cpumasks[type]); - cpumask_copy(housekeeping.cpumasks[type], - housekeeping_staging); + cpumask_copy(mask, housekeeping_staging); + RCU_INIT_POINTER(housekeeping.cpumasks[type], mask); } =20 static int __init housekeeping_setup(char *str, 
unsigned long flags) @@ -162,7 +176,7 @@ static int __init housekeeping_setup(char *str, unsigne= d long flags) =20 for_each_set_bit(type, &iter_flags, HK_TYPE_MAX) { if (!cpumask_equal(housekeeping_staging, - housekeeping.cpumasks[type])) { + housekeeping_cpumask(type))) { pr_warn("Housekeeping: nohz_full=3D must match isolcpus=3D\n"); goto free_housekeeping_staging; } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index adfb6e3409d7..2fc1ddde3120 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1467C34DCEC; Wed, 5 Nov 2025 21:05:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376745; cv=none; b=fDS9+oDIGi1fdHgZFZQ/lBX5O/LJepiznQEEIEAQzsDVl5kzfP0+8TutitObprPjH1D3LKmqniT0bo4O9nniwe/rNmymFhZ/0KLzb6pXx6CIVyxt+EMHMb3zY64KPzGxxb1YfYbphliXbsXsKtvYeiBg+OMR8RVT51VNjTL7fh0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376745; c=relaxed/simple; bh=NSCbW1jqRDXNXOPN1q+JI6ONLqfc5mcXxsKSGojX9XI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ebfKXD88JsG5vrwfaA1WkRjGQiOVEXYTMBMZzTh1WjSLL9K2y0T7JGczBgCf+Ius4N26HR/K8i6ObdPvcFKnN7/dyiFk1f3XEp/Tlng8OP/lqir6mFF5/x+yb3+XqGErllUSgBeCRnog5Odc60W1dKYZHDlEVoA2IHmG37gNkDI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=LWMPtEAp; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) 
header.d=kernel.org header.i=@kernel.org header.b="LWMPtEAp" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 58A86C116B1; Wed, 5 Nov 2025 21:05:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376744; bh=NSCbW1jqRDXNXOPN1q+JI6ONLqfc5mcXxsKSGojX9XI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LWMPtEAp/SsLSnIk4m9A51hGP5hc8yyJBfe6xKhxmY+YrwzM32WL807kODsfESVMz fF20ZBdm8d/ym9V3Zmx4xpDAk9A6WPScqbcRTBISSmofkLCz3/o2qRSxceU3mk65Lb PMiOQVFgTZ/Ybf229Gsbmwo83ZPVdYeyNiPR7BNF5Sqw6xo4kNThz+82IRPGtzf1+C cKmE2pjXhT/2gMis1KSX4NCGdARFqBpIKATDnY37g1s7cOHae3a8oqWNQFjzCX5uj/ I3/SJvSHJ0AboZHGXpc9Bs80RhWWAN6f5aKr/xGZEGmmsNG1i/DG43cD41sYayUrc7 czbmTyBk5vpaQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 13/31] cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset
Date: Wed, 5 Nov 2025 22:03:29 +0100
Message-ID: <20251105210348.35256-14-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Until now, HK_TYPE_DOMAIN only included the boot-defined isolated CPUs passed through the isolcpus=3D boot option. Users who also want to know about the runtime-defined isolated CPUs set through cpuset must use different APIs: cpuset_cpu_is_isolated(), cpu_is_isolated(), etc.

This approach has several drawbacks:

1) Most interested subsystems want to know about all isolated CPUs, not just those defined at boot time.

2) cpuset_cpu_is_isolated() / cpu_is_isolated() are not synchronized with concurrent cpuset changes.

3) Further cpuset modifications are not propagated to subsystems.

Solve 1) and 2) by centralizing all isolated CPUs within the HK_TYPE_DOMAIN housekeeping cpumask. Subsystems can rely on RCU to synchronize against concurrent changes.

The propagation mentioned in 3) will be handled in further patches.
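
The mask derivation described above can be pictured with a small userspace sketch. It assumes a machine with at most 64 CPUs so that a cpumask fits in a uint64_t. The helper name hk_domain_mask() is hypothetical and only mirrors the cpumask_andnot() / cpumask_intersects() steps that housekeeping_update() performs in the diff below; it is an illustration, not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace illustration only: one bit per CPU, at most 64 CPUs.
 * hk_domain_mask() is a hypothetical helper, not a kernel API.
 *
 * boot_hk:  housekeeping CPUs left after isolcpus= (HK_TYPE_DOMAIN_BOOT)
 * isolated: CPUs isolated at runtime through cpuset partitions
 * online:   currently online CPUs
 *
 * On success, stores boot_hk & ~isolated in *out and returns 0.
 * Returns -EINVAL, leaving *out untouched, if no online CPU would
 * remain available for housekeeping work.
 */
static int hk_domain_mask(uint64_t boot_hk, uint64_t isolated,
			  uint64_t online, uint64_t *out)
{
	uint64_t trial = boot_hk & ~isolated;

	assert(out != NULL);
	if (!(trial & online))
		return -EINVAL;

	*out = trial;
	return 0;
}
```

Since the result is computed as boot_hk & ~isolated, it can only remove CPUs from the boot-time mask, which matches the "always a subset of HK_TYPE_DOMAIN_BOOT" comment added to enum hk_type.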
Signed-off-by: Frederic Weisbecker --- include/linux/sched/isolation.h | 7 +++ kernel/cgroup/cpuset.c | 2 + kernel/sched/isolation.c | 76 ++++++++++++++++++++++++++++++--- kernel/sched/sched.h | 1 + 4 files changed, 80 insertions(+), 6 deletions(-) diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolatio= n.h index 109a2149e21a..aaf2a672f8dc 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -9,6 +9,11 @@ enum hk_type { /* Revert of boot-time isolcpus=3D argument */ HK_TYPE_DOMAIN_BOOT, + /* + * Same as HK_TYPE_DOMAIN_BOOT but also includes the + * revert of cpuset isolated partitions. As such it + * is always a subset of HK_TYPE_DOMAIN_BOOT. + */ HK_TYPE_DOMAIN, /* Revert of boot-time isolcpus=3Dmanaged_irq argument */ HK_TYPE_MANAGED_IRQ, @@ -35,6 +40,7 @@ extern const struct cpumask *housekeeping_cpumask(enum hk= _type type); extern bool housekeeping_enabled(enum hk_type type); extern void housekeeping_affine(struct task_struct *t, enum hk_type type); extern bool housekeeping_test_cpu(int cpu, enum hk_type type); +extern int housekeeping_update(struct cpumask *mask, enum hk_type type); extern void __init housekeeping_init(void); =20 #else @@ -62,6 +68,7 @@ static inline bool housekeeping_test_cpu(int cpu, enum hk= _type type) return true; } =20 +static inline int housekeeping_update(struct cpumask *mask, enum hk_type t= ype) { return 0; } static inline void housekeeping_init(void) { } #endif /* CONFIG_CPU_ISOLATION */ =20 diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index aa1ac7bcf2ea..b04a4242f2fa 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1403,6 +1403,8 @@ static void update_unbound_workqueue_cpumask(bool iso= lcpus_updated) =20 ret =3D workqueue_unbound_exclude_cpumask(isolated_cpus); WARN_ON_ONCE(ret < 0); + ret =3D housekeeping_update(isolated_cpus, HK_TYPE_DOMAIN); + WARN_ON_ONCE(ret < 0); } =20 /** diff --git a/kernel/sched/isolation.c 
b/kernel/sched/isolation.c index bee6c04be103..80a5b7c6400c 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -29,18 +29,48 @@ static struct housekeeping housekeeping; =20 bool housekeeping_enabled(enum hk_type type) { - return !!(housekeeping.flags & BIT(type)); + return !!(READ_ONCE(housekeeping.flags) & BIT(type)); } EXPORT_SYMBOL_GPL(housekeeping_enabled); =20 +static bool housekeeping_dereference_check(enum hk_type type) +{ + if (IS_ENABLED(CONFIG_LOCKDEP) && type =3D=3D HK_TYPE_DOMAIN) { + /* Cpuset isn't even writable yet? */ + if (system_state <=3D SYSTEM_SCHEDULING) + return true; + + /* CPU hotplug write locked, so cpuset partition can't be overwritten */ + if (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_write_held()) + return true; + + /* Cpuset lock held, partitions not writable */ + if (IS_ENABLED(CONFIG_CPUSETS) && lockdep_is_cpuset_held()) + return true; + + return false; + } + + return true; +} + +static inline struct cpumask *housekeeping_cpumask_dereference(enum hk_typ= e type) +{ + return rcu_dereference_all_check(housekeeping.cpumasks[type], + housekeeping_dereference_check(type)); +} + const struct cpumask *housekeeping_cpumask(enum hk_type type) { + const struct cpumask *mask =3D NULL; + if (static_branch_unlikely(&housekeeping_overridden)) { - if (housekeeping.flags & BIT(type)) { - return rcu_dereference_check(housekeeping.cpumasks[type], 1); - } + if (READ_ONCE(housekeeping.flags) & BIT(type)) + mask =3D housekeeping_cpumask_dereference(type); } - return cpu_possible_mask; + if (!mask) + mask =3D cpu_possible_mask; + return mask; } EXPORT_SYMBOL_GPL(housekeeping_cpumask); =20 @@ -80,12 +110,46 @@ EXPORT_SYMBOL_GPL(housekeeping_affine); =20 bool housekeeping_test_cpu(int cpu, enum hk_type type) { - if (static_branch_unlikely(&housekeeping_overridden) && housekeeping.flag= s & BIT(type)) + if (static_branch_unlikely(&housekeeping_overridden) && + READ_ONCE(housekeeping.flags) & BIT(type)) return 
cpumask_test_cpu(cpu, housekeeping_cpumask(type)); return true; } EXPORT_SYMBOL_GPL(housekeeping_test_cpu); =20 +int housekeeping_update(struct cpumask *mask, enum hk_type type) +{ + struct cpumask *trial, *old =3D NULL; + + if (type !=3D HK_TYPE_DOMAIN) + return -ENOTSUPP; + + trial =3D kmalloc(cpumask_size(), GFP_KERNEL); + if (!trial) + return -ENOMEM; + + cpumask_andnot(trial, housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT), mask); + if (!cpumask_intersects(trial, cpu_online_mask)) { + kfree(trial); + return -EINVAL; + } + + if (!housekeeping.flags) + static_branch_enable(&housekeeping_overridden); + + if (housekeeping.flags & BIT(type)) + old =3D housekeeping_cpumask_dereference(type); + else + WRITE_ONCE(housekeeping.flags, housekeeping.flags | BIT(type)); + rcu_assign_pointer(housekeeping.cpumasks[type], trial); + + synchronize_rcu(); + + kfree(old); + + return 0; +} + void __init housekeeping_init(void) { enum hk_type type; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 2fc1ddde3120..5a44e85d4864 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -30,6 +30,7 @@ #include #include #include +#include #include #include #include --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 48BEF34E749; Wed, 5 Nov 2025 21:05:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376752; cv=none; b=VDBAA+P1ALllJIzWP9G+76VuejyXi1lwktMAFhstl4VLGz6Fm59SBj6Kiyxe4w5m0Qbx0pjNNNRxzysgyF7gTZsIwPvNxlH1ZE+W73xOy+gxDiQKwjytgwnOY6/Ks706VA7Se7ymL4yrOCIB+mczVRkdQei5hD8mU3e16StsYIY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376752; c=relaxed/simple; 
bh=F8RS6HxMEwD+CdqqVcGXcs4QntwP/aQ69bm2ypLXfL0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=q4OTq+fKdjUfr7V4CmZHpg9ybDJk6eluIBsF2wZPJxAuwGiqkGLIYi/4mkNN8lU3zs7MM3k2/vzkpEEEhcJDbTurGOtF0Nk/SFOcJPo6DyztkJuep0l0xXaheG0Jq1ThsUj/bmGrY31JW4CyX5Wh6FH5Y90i96OiwztXqUkenFU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Knp5BQSS; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Knp5BQSS" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5B768C116C6; Wed, 5 Nov 2025 21:05:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376752; bh=F8RS6HxMEwD+CdqqVcGXcs4QntwP/aQ69bm2ypLXfL0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Knp5BQSSwU501i5GIxnOcbJwTVtyFd53ptT/lFssQbFaYhKGEBfKTkv46MZZ5eSe1 qxuJjt66g7Sq6NeQA6NaWjGbi56uDFYVJbvkh8/lprQJWhDZFDxww7qnbY7RQvWEfY Z55jySVJI3FlCGLh99LtrKlkWqidLK9czXsitcbTscRVs7X079FoC9rzCZ0GfbpRMe XaOAMs+/PWPGsjQXVydlFUcSk0jAgQIS8NvlcbuFOG8eu84588igVhduPlmyPu9p/v UTgVFquC2M3nQSL8WmvFgWOMy0XwEQlYhFjT5Fker0h+UYE/olRnCJKAsljLmgJGLu Zyqg0u8oNPkQw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 14/31] sched/isolation: Flush memcg workqueues on cpuset isolated partition change
Date: Wed, 5 Nov 2025 22:03:30 +0100
Message-ID: <20251105210348.35256-15-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The HK_TYPE_DOMAIN housekeeping cpumask is now modifiable at runtime. To make sure that no asynchronous memcg draining is still pending or executing on a newly isolated CPU, the housekeeping subsystem must flush the memcg workqueues.

However, the memcg work items can't be flushed easily since they are queued to the main per-CPU workqueue pool.

Solve this by creating a memcg-specific workqueue, and by providing and using the appropriate flushing API.
Acked-by: Shakeel Butt Signed-off-by: Frederic Weisbecker --- include/linux/memcontrol.h | 4 ++++ kernel/sched/isolation.c | 2 ++ kernel/sched/sched.h | 1 + mm/memcontrol.c | 12 +++++++++++- 4 files changed, 18 insertions(+), 1 deletion(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 873e510d6f8d..001200df63cf 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -1074,6 +1074,8 @@ static inline u64 cgroup_id_from_mm(struct mm_struct = *mm) return id; } =20 +void mem_cgroup_flush_workqueue(void); + extern int mem_cgroup_init(void); #else /* CONFIG_MEMCG */ =20 @@ -1481,6 +1483,8 @@ static inline u64 cgroup_id_from_mm(struct mm_struct = *mm) return 0; } =20 +static inline void mem_cgroup_flush_workqueue(void) { } + static inline int mem_cgroup_init(void) { return 0; } #endif /* CONFIG_MEMCG */ =20 diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 80a5b7c6400c..16c912dd91d2 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -145,6 +145,8 @@ int housekeeping_update(struct cpumask *mask, enum hk_t= ype type) =20 synchronize_rcu(); =20 + mem_cgroup_flush_workqueue(); + kfree(old); =20 return 0; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 5a44e85d4864..77034d20b4e8 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -44,6 +44,7 @@ #include #include #include +#include #include #include #include diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 1033e52ab6cf..4d1f680a4bb0 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -95,6 +95,8 @@ static bool cgroup_memory_nokmem __ro_after_init; /* BPF memory accounting disabled? 
*/ static bool cgroup_memory_nobpf __ro_after_init; =20 +static struct workqueue_struct *memcg_wq __ro_after_init; + static struct kmem_cache *memcg_cachep; static struct kmem_cache *memcg_pn_cachep; =20 @@ -1975,7 +1977,7 @@ static void schedule_drain_work(int cpu, struct work_= struct *work) { guard(rcu)(); if (!cpu_is_isolated(cpu)) - schedule_work_on(cpu, work); + queue_work_on(cpu, memcg_wq, work); } =20 /* @@ -5092,6 +5094,11 @@ void mem_cgroup_sk_uncharge(const struct sock *sk, u= nsigned int nr_pages) refill_stock(memcg, nr_pages); } =20 +void mem_cgroup_flush_workqueue(void) +{ + flush_workqueue(memcg_wq); +} + static int __init cgroup_memory(char *s) { char *token; @@ -5134,6 +5141,9 @@ int __init mem_cgroup_init(void) cpuhp_setup_state_nocalls(CPUHP_MM_MEMCQ_DEAD, "mm/memctrl:dead", NULL, memcg_hotplug_cpu_dead); =20 + memcg_wq =3D alloc_workqueue("memcg", WQ_PERCPU, 0); + WARN_ON(!memcg_wq); + for_each_possible_cpu(cpu) { INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work, drain_local_memcg_stock); --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9E7F234CFD6; Wed, 5 Nov 2025 21:06:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376761; cv=none; b=pBQzj1cnpu2UqQ405rJkvrL9Ygrb41a2cN1eNI/g/FXFb9b431LPSJyiddlfHXrCbu2JP7CO7B8ZoJyGvHJ1zFs3tMtsVPV2h/5CSrTpRFNbrjw/ofM39PVjkQuo45Kz4kde0HvkEaT/MyN5QPRfVRisDRz/2SjzaINVivfQaSc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376761; c=relaxed/simple; bh=v2zJK6mDWINhu5V45BvrdNTmRnVcCM3RKrdK2Jievec=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=DZFOV2wY/5DNqlqlxOhYADKbKVjR8ApBCnacLNXIijt5MnZOm9hDRJ9O7IIaAbUaKAAOuY1P6ZwW2vbp2zci7TuMTLdfP2aHCAmPW3D1s+d1jk2ZtWvsn2/AIEVaCbwdHMnPGj21Y/xz5EA23aPunWM2mL9RTPcasS5igHLmKRk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WOhlsM3F; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WOhlsM3F" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 931BAC116B1; Wed, 5 Nov 2025 21:05:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376760; bh=v2zJK6mDWINhu5V45BvrdNTmRnVcCM3RKrdK2Jievec=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WOhlsM3FFPflzzkT+kungUw4CpxH4WtHCHs2d5HiMKT5nflQAjqznTB9idFhGi9j9 tRztsTxI9koU2H6Z3S6lkIFF9qmFzav/2u8hz+uvQB2wL5RQInBeh3R4ri0XnCSI/L GKTmU+JnRlzeT5LmvFwcbWxVA7NWv3IMe2dOrHv3bm8xf7lqifjGolmaJ9b66D4hCa 98qA5BREUujP5Z+vrdxNj0VnMmqeIVDjPJeHddkqna0zv9syW0oOqFapktxNlVlp4f kGURmDoYUwQZDXTMnR1JDZahFJALUF4V9l+OFovx6EBHHg/uJjmGKxPRiCTeGd9v9S x3UKTy6CIPBGQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 15/31] sched/isolation: Flush vmstat workqueues on cpuset isolated partition change
Date: Wed, 5 Nov 2025 22:03:31 +0100
Message-ID: <20251105210348.35256-16-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The HK_TYPE_DOMAIN housekeeping cpumask is now modifiable at runtime. To make sure that no asynchronous vmstat work is still pending or executing on a newly isolated CPU, the housekeeping subsystem must flush the vmstat workqueues.

This involves flushing the whole mm_percpu_wq workqueue, which is shared with LRU drain, so pending LRU drain works are flushed as well. This is a welcome side effect.
Signed-off-by: Frederic Weisbecker --- include/linux/vmstat.h | 2 ++ kernel/sched/isolation.c | 1 + kernel/sched/sched.h | 1 + mm/vmstat.c | 5 +++++ 4 files changed, 9 insertions(+) diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h index c287998908bf..a81aa5635b47 100644 --- a/include/linux/vmstat.h +++ b/include/linux/vmstat.h @@ -303,6 +303,7 @@ int calculate_pressure_threshold(struct zone *zone); int calculate_normal_threshold(struct zone *zone); void set_pgdat_percpu_threshold(pg_data_t *pgdat, int (*calculate_pressure)(struct zone *)); +void vmstat_flush_workqueue(void); #else /* CONFIG_SMP */ =20 /* @@ -403,6 +404,7 @@ static inline void __dec_node_page_state(struct page *p= age, static inline void refresh_zone_stat_thresholds(void) { } static inline void cpu_vm_stats_fold(int cpu) { } static inline void quiet_vmstat(void) { } +static inline void vmstat_flush_workqueue(void) { } =20 static inline void drain_zonestat(struct zone *zone, struct per_cpu_zonestat *pzstats) { } diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 16c912dd91d2..8338c9259f4f 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -146,6 +146,7 @@ int housekeeping_update(struct cpumask *mask, enum hk_t= ype type) synchronize_rcu(); =20 mem_cgroup_flush_workqueue(); + vmstat_flush_workqueue(); =20 kfree(old); =20 diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 77034d20b4e8..c638fc51fc07 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -68,6 +68,7 @@ #include #include #include +#include #include #include #include diff --git a/mm/vmstat.c b/mm/vmstat.c index 7afb2981501f..506d3ca2e47f 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -2115,6 +2115,11 @@ static void vmstat_shepherd(struct work_struct *w); =20 static DECLARE_DEFERRABLE_WORK(shepherd, vmstat_shepherd); =20 +void vmstat_flush_workqueue(void) +{ + flush_workqueue(mm_percpu_wq); +} + static void vmstat_shepherd(struct work_struct *w) { int cpu; --=20 2.51.0 
From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0990534D4C6; Wed, 5 Nov 2025 21:06:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376769; cv=none; b=U8ru2TM0UruIWU91aCLrqMVLxPq5yAITWm2TfqEwF86kmgzdoQr+HfCcknVUNL9+3nO3A7C5vV6qd948I0ABPMkNeM7gdywX0/2RRgyXMLuBz4MoODVwJf4keM9nDm81lSljKEB9cIyOz/7+x9wP2+T8BPlI6GJW2/dGQj/X/+M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376769; c=relaxed/simple; bh=JiUppWP1p8g4DP4Hr+L5LcwYlcsK9hZTBPLusxTFDcw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fKXAWkDL9Cbc3VaiDt6+LaPeGY7It1toTKz5m1tZWR+hsdSfCx5yEG36WosXNRnPSeC3Nvve1m6OHY7BBtPEG/dbU71nYHbZGvpiFqyqAxysnbhjyAPYsQMPh9/owm51YqSI/MVlz3TU4VgA5UFjjp/czPwquHx28MBYKV09rkU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SGso82lX; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SGso82lX" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BE3FEC4CEF5; Wed, 5 Nov 2025 21:06:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376768; bh=JiUppWP1p8g4DP4Hr+L5LcwYlcsK9hZTBPLusxTFDcw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SGso82lXWej3gwCMVjG+In6OAuz+oWZIKHcl5zSkrOo3iOJGj80tY/uefIBDRdkJ9 zrvpNvHhAFFpeF34HITfKb6fyYMIuBhrTbWFnLXEVscW2zkZidtqpGCdGuvgz1JgjY dCkPvfRBFOQi0T7YMFHxRMbRqMctTBLmDoh2rTETHdurAo9vyHfYVedSr5ZTErqIb3 G9wx0vk1arjCU5QIM6uwrwG8I1+mTTvOWMMauQjTmn9uCcP8xpqKqsMwOZdmr0Y4wb 
84As7/sgKqt1XxVy0Y4Ch2VRL3o/Nsk+cJgGkS7By3pLZ5nSFzJWRuqKlFG9Y92CbU syTh45q8Ls+Cg==
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 16/31] PCI: Flush PCI probe workqueue on cpuset isolated partition change
Date: Wed, 5 Nov 2025 22:03:32 +0100
Message-ID: <20251105210348.35256-17-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The HK_TYPE_DOMAIN housekeeping cpumask is now modifiable at runtime. To synchronize against PCI probe works and make sure that no asynchronous probing is still pending or executing on a newly isolated CPU, the housekeeping subsystem must flush the PCI probe works.

However, the PCI probe works can't be flushed easily since they are queued to the main per-CPU workqueue pool.

Solve this by creating a PCI probe-specific workqueue, and by providing and using the appropriate flushing API.
Signed-off-by: Frederic Weisbecker --- drivers/pci/pci-driver.c | 17 ++++++++++++++++- include/linux/pci.h | 3 +++ kernel/sched/isolation.c | 2 ++ 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index 7b74d22b20f7..ac86aaec8bcf 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -337,6 +337,8 @@ static int local_pci_probe(struct drv_dev_and_id *ddi) return 0; } =20 +static struct workqueue_struct *pci_probe_wq; + struct pci_probe_arg { struct drv_dev_and_id *ddi; struct work_struct work; @@ -407,7 +409,11 @@ static int pci_call_probe(struct pci_driver *drv, stru= ct pci_dev *dev, cpu =3D cpumask_any_and(cpumask_of_node(node), wq_domain_mask); if (cpu < nr_cpu_ids) { - schedule_work_on(cpu, &arg.work); + struct workqueue_struct *wq =3D pci_probe_wq; + + if (WARN_ON_ONCE(!wq)) + wq =3D system_percpu_wq; + queue_work_on(cpu, wq, &arg.work); rcu_read_unlock(); flush_work(&arg.work); error =3D arg.ret; @@ -425,6 +431,11 @@ static int pci_call_probe(struct pci_driver *drv, stru= ct pci_dev *dev, return error; } =20 +void pci_probe_flush_workqueue(void) +{ + flush_workqueue(pci_probe_wq); +} + /** * __pci_device_probe - check if a driver wants to claim a specific PCI de= vice * @drv: driver to call to check if it wants the PCI device @@ -1760,6 +1771,10 @@ static int __init pci_driver_init(void) { int ret; =20 + pci_probe_wq =3D alloc_workqueue("sync_wq", WQ_PERCPU, 0); + if (!pci_probe_wq) + return -ENOMEM; + ret =3D bus_register(&pci_bus_type); if (ret) return ret; diff --git a/include/linux/pci.h b/include/linux/pci.h index d1fdf81fbe1e..3281c235b895 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1175,6 +1175,7 @@ struct pci_bus *pci_create_root_bus(struct device *pa= rent, int bus, struct pci_ops *ops, void *sysdata, struct list_head *resources); int pci_host_probe(struct pci_host_bridge *bridge); +void pci_probe_flush_workqueue(void); int 
pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax); int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax); void pci_bus_release_busn_res(struct pci_bus *b); @@ -2037,6 +2038,8 @@ static inline int pci_has_flag(int flag) { return 0; } _PCI_NOP_ALL(read, *) _PCI_NOP_ALL(write,) =20 +static inline void pci_probe_flush_workqueue(void) { } + static inline struct pci_dev *pci_get_device(unsigned int vendor, unsigned int device, struct pci_dev *from) diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 8338c9259f4f..303cc3419ecb 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -8,6 +8,7 @@ * */ #include +#include #include "sched.h" =20 enum hk_flags { @@ -145,6 +146,7 @@ int housekeeping_update(struct cpumask *mask, enum hk_t= ype type) =20 synchronize_rcu(); =20 + pci_probe_flush_workqueue(); mem_cgroup_flush_workqueue(); vmstat_flush_workqueue(); =20 --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5DD5C34EF04; Wed, 5 Nov 2025 21:06:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376777; cv=none; b=TBHOvNg3TvCbP5ZMtf24hJFApRxQtny4iYsQR9NamfTkvbPHZqFFcMs3LoKPwaHGeWULzmXyWcPUibuppCF/BUWxk6hEb9CZcAawtPcpqef1N9/ttfoz12RY5YZgxuwlvNkg8DNcCqbDA0b5QSrpCCYrs+uJa2iKyWtMjqxsehE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376777; c=relaxed/simple; bh=loTWvdxqHDkI7ziyDaTEdsHDniRN0/Kv9B5PIMcBLqU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=mQGd9ayQ+IipoXHnhsD9St5MG31QOwYWx3fekeBJGcg05FFaxHZmuxC5ngBatiInhNiFF3RZ6KoxnwjBLSnTIhHnhesz3ngq1vWFLGalQNvK2qB7sHfI6G2ubIk4415NquafQs75K23xxWArK/987AWMVtaGTX8/NMmDM8aGMY8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MyltMr5p; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MyltMr5p" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 54997C4CEF5; Wed, 5 Nov 2025 21:06:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376776; bh=loTWvdxqHDkI7ziyDaTEdsHDniRN0/Kv9B5PIMcBLqU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MyltMr5ptM7YrDB923RK+oyskliOi3MIym+9tnhn1S5Pe8fXJ1sRM75av96ENfY/u zkPQYYHQReS8Sz5vZW1GdKfBgZ76lvDmGdqTpZgF3Q4lPGx5tAjEtf+dSQVsE2eMJz 1R03mSwTv62GxYfVA5MvFn6Zp+Dkz7rh3Iq+RAcrjnAF8pycT9cQ0myeZviFgjNvbS h0zbHmKeMJXCU5x6R0+D0nsHQAZnfPNx3mgl/GhsK+SLXwKEpLtYD4wjqst1Z9wQow aeECMugZnWI1usI2r9jgeqGy3wXuLDFeXNQ/DZCS8h8deCdZ4ef9BfmVZH7lPx9scB geOwsWPdq8H5Q== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 17/31] cpuset: Propagate cpuset isolation update to workqueue through housekeeping
Date: Wed, 5 Nov 2025 22:03:33 +0100
Message-ID: <20251105210348.35256-18-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Until now, cpuset propagated isolated partition changes to workqueues so that unbound workers get properly reaffined.

Since housekeeping now centralizes, synchronizes and propagates isolation cpumask changes, perform the work from that subsystem for consolidation and consistency purposes.

For simplicity, the target function is adapted to take the new housekeeping mask instead of the isolated mask.
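
The switch from an "exclude" mask to a housekeeping mask can be pictured with a small userspace sketch. It assumes at most 64 CPUs so that a cpumask fits in a uint64_t. The helper name unbound_effective_mask() is hypothetical, and the handling of an empty intersection is an assumption of this sketch, not something this patch states; only the "intersection of the new housekeeping with the requested affinity" rule comes from the kernel-doc comment in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace illustration only: one bit per CPU, at most 64 CPUs.
 * unbound_effective_mask() is a hypothetical helper, not a kernel API.
 *
 * requested: unbound affinity requested at boot or through sysfs
 * hk:        new housekeeping mask handed over by housekeeping_update()
 * cur:       mask presently applied to unbound workqueues
 *
 * Returns true and updates *cur when the intersection is non-empty.
 * Assumption of this sketch: an empty intersection leaves *cur as is.
 */
static bool unbound_effective_mask(uint64_t requested, uint64_t hk,
				   uint64_t *cur)
{
	uint64_t effective = requested & hk;

	assert(cur != NULL);
	if (!effective)
		return false;

	*cur = effective;
	return true;
}
```

Taking the housekeeping mask directly means the caller no longer has to express the update as "requested and not isolated"; a plain intersection is enough.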
Suggested-by: Tejun Heo Signed-off-by: Frederic Weisbecker --- include/linux/workqueue.h | 2 +- init/Kconfig | 1 + kernel/cgroup/cpuset.c | 14 ++++++-------- kernel/sched/isolation.c | 4 +++- kernel/workqueue.c | 17 ++++++++++------- 5 files changed, 21 insertions(+), 17 deletions(-) diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index dabc351cc127..a4749f56398f 100644 --- a/include/linux/workqueue.h +++ b/include/linux/workqueue.h @@ -588,7 +588,7 @@ struct workqueue_attrs *alloc_workqueue_attrs_noprof(vo= id); void free_workqueue_attrs(struct workqueue_attrs *attrs); int apply_workqueue_attrs(struct workqueue_struct *wq, const struct workqueue_attrs *attrs); -extern int workqueue_unbound_exclude_cpumask(cpumask_var_t cpumask); +extern int workqueue_unbound_housekeeping_update(const struct cpumask *hk); =20 extern bool queue_work_on(int cpu, struct workqueue_struct *wq, struct work_struct *work); diff --git a/init/Kconfig b/init/Kconfig index cab3ad28ca49..a1b3a3b66bfc 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1247,6 +1247,7 @@ config CPUSETS bool "Cpuset controller" depends on SMP select UNION_FIND + select CPU_ISOLATION help This option will let you create and manage CPUSETs which allow dynamically partitioning a system into sets of CPUs and diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index b04a4242f2fa..ea102e4695a5 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1392,7 +1392,7 @@ static bool partition_xcpus_del(int old_prs, struct c= puset *parent, return isolcpus_updated; } =20 -static void update_unbound_workqueue_cpumask(bool isolcpus_updated) +static void update_housekeeping_cpumask(bool isolcpus_updated) { int ret; =20 @@ -1401,8 +1401,6 @@ static void update_unbound_workqueue_cpumask(bool iso= lcpus_updated) if (!isolcpus_updated) return; =20 - ret =3D workqueue_unbound_exclude_cpumask(isolated_cpus); - WARN_ON_ONCE(ret < 0); ret =3D housekeeping_update(isolated_cpus, HK_TYPE_DOMAIN); 
WARN_ON_ONCE(ret < 0); } @@ -1558,7 +1556,7 @@ static int remote_partition_enable(struct cpuset *cs,= int new_prs, list_add(&cs->remote_sibling, &remote_children); cpumask_copy(cs->effective_xcpus, tmp->new_cpus); spin_unlock_irq(&callback_lock); - update_unbound_workqueue_cpumask(isolcpus_updated); + update_housekeeping_cpumask(isolcpus_updated); cpuset_force_rebuild(); cs->prs_err =3D 0; =20 @@ -1599,7 +1597,7 @@ static void remote_partition_disable(struct cpuset *c= s, struct tmpmasks *tmp) compute_excpus(cs, cs->effective_xcpus); reset_partition_data(cs); spin_unlock_irq(&callback_lock); - update_unbound_workqueue_cpumask(isolcpus_updated); + update_housekeeping_cpumask(isolcpus_updated); cpuset_force_rebuild(); =20 /* @@ -1668,7 +1666,7 @@ static void remote_cpus_update(struct cpuset *cs, str= uct cpumask *xcpus, if (xcpus) cpumask_copy(cs->exclusive_cpus, xcpus); spin_unlock_irq(&callback_lock); - update_unbound_workqueue_cpumask(isolcpus_updated); + update_housekeeping_cpumask(isolcpus_updated); if (adding || deleting) cpuset_force_rebuild(); =20 @@ -2027,7 +2025,7 @@ static int update_parent_effective_cpumask(struct cpu= set *cs, int cmd, WARN_ON_ONCE(parent->nr_subparts < 0); } spin_unlock_irq(&callback_lock); - update_unbound_workqueue_cpumask(isolcpus_updated); + update_housekeeping_cpumask(isolcpus_updated); =20 if ((old_prs !=3D new_prs) && (cmd =3D=3D partcmd_update)) update_partition_exclusive_flag(cs, new_prs); @@ -3047,7 +3045,7 @@ static int update_prstate(struct cpuset *cs, int new_= prs) else if (isolcpus_updated) isolated_cpus_update(old_prs, new_prs, cs->effective_xcpus); spin_unlock_irq(&callback_lock); - update_unbound_workqueue_cpumask(isolcpus_updated); + update_housekeeping_cpumask(isolcpus_updated); =20 /* Force update if switching back to member & update effective_xcpus */ update_cpumasks_hier(cs, &tmpmask, !new_prs); diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 303cc3419ecb..bad5fdf7e991 100644 --- 
a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -121,6 +121,7 @@ EXPORT_SYMBOL_GPL(housekeeping_test_cpu); int housekeeping_update(struct cpumask *mask, enum hk_type type) { struct cpumask *trial, *old =3D NULL; + int err; =20 if (type !=3D HK_TYPE_DOMAIN) return -ENOTSUPP; @@ -149,10 +150,11 @@ int housekeeping_update(struct cpumask *mask, enum hk= _type type) pci_probe_flush_workqueue(); mem_cgroup_flush_workqueue(); vmstat_flush_workqueue(); + err =3D workqueue_unbound_housekeeping_update(housekeeping_cpumask(type)); =20 kfree(old); =20 - return 0; + return err; } =20 void __init housekeeping_init(void) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 45320e27a16c..32a436b76137 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -6945,13 +6945,16 @@ static int workqueue_apply_unbound_cpumask(const cp= umask_var_t unbound_cpumask) } =20 /** - * workqueue_unbound_exclude_cpumask - Exclude given CPUs from unbound cpu= mask - * @exclude_cpumask: the cpumask to be excluded from wq_unbound_cpumask + * workqueue_unbound_housekeeping_update - Propagate housekeeping cpumask = update + * @hk: the new housekeeping cpumask * - * This function can be called from cpuset code to provide a set of isolat= ed - * CPUs that should be excluded from wq_unbound_cpumask. + * Update the unbound workqueue cpumask on top of the new housekeeping cpu= mask such + * that the effective unbound affinity is the intersection of the new hous= ekeeping + * with the requested affinity set via nohz_full=3D/isolcpus=3D or sysfs. + * + * Return: 0 on success and -errno on failure. 
*/ -int workqueue_unbound_exclude_cpumask(cpumask_var_t exclude_cpumask) +int workqueue_unbound_housekeeping_update(const struct cpumask *hk) { cpumask_var_t cpumask; int ret =3D 0; @@ -6967,14 +6970,14 @@ int workqueue_unbound_exclude_cpumask(cpumask_var_t= exclude_cpumask) * (HK_TYPE_WQ =E2=88=A9 HK_TYPE_DOMAIN) house keeping mask and rewritten * by any subsequent write to workqueue/cpumask sysfs file. */ - if (!cpumask_andnot(cpumask, wq_requested_unbound_cpumask, exclude_cpumas= k)) + if (!cpumask_and(cpumask, wq_requested_unbound_cpumask, hk)) cpumask_copy(cpumask, wq_requested_unbound_cpumask); if (!cpumask_equal(cpumask, wq_unbound_cpumask)) ret =3D workqueue_apply_unbound_cpumask(cpumask); =20 /* Save the current isolated cpumask & export it via sysfs */ if (!ret) - cpumask_copy(wq_isolated_cpumask, exclude_cpumask); + cpumask_andnot(wq_isolated_cpumask, cpu_possible_mask, hk); =20 mutex_unlock(&wq_pool_mutex); free_cpumask_var(cpumask); --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3565134CFD7; Wed, 5 Nov 2025 21:06:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376785; cv=none; b=CKTucnQ2SwL9X/ydI5nnpBgtVMzIss04WwG7f3SJ24GbhXPe/9hYrSF8s0OWr14GLUfVtgXEOpbmXFgDats8ZJTVrMUXW/GZlFcHDT3HgTXA41RAfzWvbef/JCZJfRYGVj4ch1zgxY4ibpRjCphAxvBpJi1R8aEOPMZxwSAXp+U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376785; c=relaxed/simple; bh=1PXFAoSc5EaQ1yzCGftJTnGf1OCO/kq+1cRjiff+B0o=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=edAW0JZIif02YosnmUkKsDks2FAJbzbKPezYBF/Z8BbRQrcTvlwPeDoobV7Rd9A5auHx7afqtfSPoc3uiG1XXX7Jjio+gTJsngLl+XYq8gyhx1q6M7M/gfWfBR3qLGDb2XGaKq6MNKGNa7DF5MdwSAFJPcCjsmD2KYSi2I7M/C8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=rA4n9pw2; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="rA4n9pw2" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3A479C116D0; Wed, 5 Nov 2025 21:06:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376784; bh=1PXFAoSc5EaQ1yzCGftJTnGf1OCO/kq+1cRjiff+B0o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rA4n9pw2FOv+9hxUmMxi6GGt4SJ3mYCQWb06W9IiZHm52r67O+JOAmfjrIMDYzNSR 7hhMjSR+V0LaaEHgGb4EMGkqcbc7bh4k157CK1CGKnWGgH60SQaTfVEcf14bx2R1a7 W6AmRByimN4vv61qfnqKvrVtt84nUxX+ewRIrIERljtsHj4WPpzYQzQ3ro10IQ6UaJ i2NEpKm2VX4EY/sin6yFQF7A7VLkFpIueTpaU6/WWugVABg+bzYoul5wol8HkxJPqI 0HmF/ZV2mQzCZGdW4X6JtSqcQOvv8qs5gwBPd8rNVz7Y8l7f4Pvcd6F+y4NHWaQkyb mwjWG2dqzoj+Q== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 18/31] cpuset: Remove cpuset_cpu_is_isolated() Date: Wed, 5 Nov 2025 22:03:34 +0100 Message-ID: <20251105210348.35256-19-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The set of cpuset isolated CPUs is now accounted for in the HK_TYPE_DOMAIN housekeeping cpumask. No use case remains that needs to check only what is isolated by cpuset and not by the isolcpus=3D kernel boot parameter.
Signed-off-by: Frederic Weisbecker --- include/linux/cpuset.h | 6 ------ include/linux/sched/isolation.h | 4 +--- kernel/cgroup/cpuset.c | 12 ------------ 3 files changed, 1 insertion(+), 21 deletions(-) diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h index 051d36fec578..a10775a4f702 100644 --- a/include/linux/cpuset.h +++ b/include/linux/cpuset.h @@ -78,7 +78,6 @@ extern void cpuset_lock(void); extern void cpuset_unlock(void); extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mas= k); extern bool cpuset_cpus_allowed_fallback(struct task_struct *p); -extern bool cpuset_cpu_is_isolated(int cpu); extern nodemask_t cpuset_mems_allowed(struct task_struct *p); #define cpuset_current_mems_allowed (current->mems_allowed) void cpuset_init_current_mems_allowed(void); @@ -208,11 +207,6 @@ static inline bool cpuset_cpus_allowed_fallback(struct= task_struct *p) return false; } =20 -static inline bool cpuset_cpu_is_isolated(int cpu) -{ - return false; -} - static inline nodemask_t cpuset_mems_allowed(struct task_struct *p) { return node_possible_map; diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolatio= n.h index aaf2a672f8dc..a127629adb32 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -2,7 +2,6 @@ #define _LINUX_SCHED_ISOLATION_H =20 #include -#include #include #include =20 @@ -84,8 +83,7 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type= type) static inline bool cpu_is_isolated(int cpu) { return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || - !housekeeping_test_cpu(cpu, HK_TYPE_TICK) || - cpuset_cpu_is_isolated(cpu); + !housekeeping_test_cpu(cpu, HK_TYPE_TICK); } =20 #endif /* _LINUX_SCHED_ISOLATION_H */ diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index ea102e4695a5..e19d3375a4ec 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -29,7 +29,6 @@ #include #include #include -#include #include #include #include @@ -1405,17 +1404,6 @@ static 
void update_housekeeping_cpumask(bool isolcpu= s_updated) WARN_ON_ONCE(ret < 0); } =20 -/** - * cpuset_cpu_is_isolated - Check if the given CPU is isolated - * @cpu: the CPU number to be checked - * Return: true if CPU is used in an isolated partition, false otherwise - */ -bool cpuset_cpu_is_isolated(int cpu) -{ - return cpumask_test_cpu(cpu, isolated_cpus); -} -EXPORT_SYMBOL_GPL(cpuset_cpu_is_isolated); - /** * rm_siblings_excl_cpus - Remove exclusive CPUs that are used by sibling = cpusets * @parent: Parent cpuset containing all siblings --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B9EA734D4F8; Wed, 5 Nov 2025 21:06:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376792; cv=none; b=DmA0EyEJuHphG/2oulKzXM1UNOlLhq44OAhsZjV33wqLoMp1pqSwdxBXd2KStYX9qtp+9PHtvCHVINdB/DoYYZIvTvt1BBaWgmwEL8U+52UXQhqj31X7OJoAxo+Bpq+SnPlCAZcXLiOldPrWcgGqxbcZNr1X+IMuVz0PzDtCB1w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376792; c=relaxed/simple; bh=/4qlBV0o6wHpMCHv+PGJIH0/5p5VMnEM2VldkFFk/hA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kYH7QkPp/aFUYYXOLSFYiJ1U3/fHIsU9eNaPN3PuqUJLZD+Er5YywLcDX5xuEcErve/wwqPSii8HHv27fVXYrXUeB8qQ6b8a7cgVG+UXlTPXXn3DlIYFCT32lag96oRDZQpXApdi4cDG2lT/n4kJH46qe/VoykPko9qtl9dBBTY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WPMVl7Uv; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WPMVl7Uv" Received: by 
smtp.kernel.org (Postfix) with ESMTPSA id 2F01FC4CEF5; Wed, 5 Nov 2025 21:06:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376792; bh=/4qlBV0o6wHpMCHv+PGJIH0/5p5VMnEM2VldkFFk/hA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WPMVl7UvVPMRPQhKg2eY/RDKFC1vZcXdD08AGo6Qq/JtNgo8kPbs91s/W0brc+V5w YScVpXtDvf96TJiGbbd3qjQihx33QG/l0iLxhIfzXvXPIv/C6GBZcpUsAOA5/LUs1n oaHLFsM5/jX9GEIO4TEpslLn6l3LkI9tBTlpdE50e1TEqi8N11MwGL1neMCwf138kP PlCf+SEUHcmKj9utkNTtcf5yDLRO3jFxTVgZtfFtvRXo7um6DMx6SJcAdriB6p90Nq yLFLyG0eKaF2fF2W+V+aALv1nrBq3tM7L8gCnFVt09OqEynVIJmKSvNgTQpXOuXEU1 1rE/ac7aW9QzQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 19/31] sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated() Date: Wed, 5 Nov 2025 22:03:35 +0100 Message-ID: <20251105210348.35256-20-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" It doesn't make sense to use nohz_full without also isolating the related CPUs from the domain topology, either through the use of isolcpus=3D or cpuset isolated partitions. And now HK_TYPE_DOMAIN includes all kinds of domain isolated CPUs. This means that HK_TYPE_KERNEL_NOISE (of which HK_TYPE_TICK is only an alias) should always be a subset of HK_TYPE_DOMAIN. Therefore if a CPU is not HK_TYPE_DOMAIN, it shouldn't be HK_TYPE_KERNEL_NOISE either. Testing the former is then enough. Simplify cpu_is_isolated() accordingly. 
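The equivalence relied upon here can be checked with a small userspace model. This is only a sketch under stated assumptions: masks are 64-bit integers standing in for housekeeping cpumasks, and the subset relation is read in terms of isolated CPUs (every nohz_full CPU is also domain isolated), which is the same as saying that the HK_TYPE_DOMAIN housekeeping mask is contained in the HK_TYPE_TICK housekeeping mask. All names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a housekeeping cpumask: one bit per CPU. */
typedef uint64_t mask_t;

static inline bool hk_test(mask_t hk, int cpu)
{
	return (hk >> cpu) & 1;
}

/* Old test: isolated if the CPU is outside either housekeeping mask. */
static inline bool cpu_is_isolated_old(mask_t hk_domain, mask_t hk_tick, int cpu)
{
	return !hk_test(hk_domain, cpu) || !hk_test(hk_tick, cpu);
}

/* New test: only the domain housekeeping mask is consulted. */
static inline bool cpu_is_isolated_new(mask_t hk_domain, int cpu)
{
	return !hk_test(hk_domain, cpu);
}

/* Sane configuration: no CPU is domain housekeeping while being nohz_full. */
static inline bool hk_config_is_sane(mask_t hk_domain, mask_t hk_tick)
{
	return (hk_domain & ~hk_tick) == 0;
}

/* Usage example: CPUs 4-7 domain isolated, CPUs 6-7 also nohz_full. */
static inline void isolation_model_selfcheck(void)
{
	const mask_t hk_domain = 0x0F, hk_tick = 0x3F;

	assert(hk_config_is_sane(hk_domain, hk_tick));
	for (int cpu = 0; cpu < 8; cpu++)
		assert(cpu_is_isolated_old(hk_domain, hk_tick, cpu) ==
		       cpu_is_isolated_new(hk_domain, cpu));
}
```

With a configuration that violates the assumption (nohz_full on a CPU that is still part of the scheduler domains), the two tests disagree for that CPU, which is why the simplification only holds for sane setups.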
Signed-off-by: Frederic Weisbecker --- include/linux/sched/isolation.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolatio= n.h index a127629adb32..a24acefacf9f 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -82,8 +82,7 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type= type) =20 static inline bool cpu_is_isolated(int cpu) { - return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || - !housekeeping_test_cpu(cpu, HK_TYPE_TICK); + return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN); } =20 #endif /* _LINUX_SCHED_ISOLATION_H */ --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5337E34F473; Wed, 5 Nov 2025 21:06:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376800; cv=none; b=rG7EUZPBUwMjIij553SG4UchbLhmOPupR3Khhzkf+CwR7xXXj0oUd7vhFxYRw4cd7HXCLtdXjJ7nuAY2NpjEXWXTUdDNrCpwKAq2F5pxmEgwP/46XvFU/tT1P0wtMNOsxSjVNAx8yEy6hu7LZDLCqBC4/jWxoqriT3MK0J5+oGA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376800; c=relaxed/simple; bh=fIAIlX7FO5mGPQBGWuCynXC0SPKY4AIZnRPcce8YI4Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=VgdkAZLfckvgIPtzxrG145BL+ebR/L0/V5NozohYD8nHJaRymNMzCBkzBXA1MIwWvLNzC8FKYCDDl6K772a28ORyEWDVGqFqcLCt1GB/G0Bi9/zDkV3WgE+0QYDkc2PNem09JF85mEvOrjKYhcUhgz/M1Iu3JKrvf0aPhYAnyvs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=uTniV3S/; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="uTniV3S/" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0FA7FC4CEF5; Wed, 5 Nov 2025 21:06:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376800; bh=fIAIlX7FO5mGPQBGWuCynXC0SPKY4AIZnRPcce8YI4Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uTniV3S/EeJ80AND7KBDsXu8Xpe6JM7FMSiWlVLuYUSx1fZ5Es85QML/UqcAKlExj TBu3d6jiTcVnPTUg6trXffi1yL6+pk2kHaXc967OfABBppOpNR5YB2toKuWySjNocr 1Ukzr9rUGpHh4p8j/EIRfYT7VJZihm/gISAZHj3geJ0Xs5YxKq6o19WtZqjCQ0TkpX c51eNX+XGOtADtu9WKoX3bLS7Mi06TUMXagWf5Ke6icI1nHCDIn0D3kNee0ZMbEPHH HWTE5hGnM2szJSqgHeTuViOa94S94x6lb8KEOyvrSQxjXsBjNj1jUHxi04XQhFi5Bm DVfpNf5l6u1aw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 20/31] PCI: Remove superfluous HK_TYPE_WQ check Date: Wed, 5 Nov 2025 22:03:36 +0100 Message-ID: <20251105210348.35256-21-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" It doesn't make sense to use nohz_full without also isolating the related CPUs from the domain topology, either through the use of isolcpus=3D or cpuset isolated partitions. And now HK_TYPE_DOMAIN includes all kinds of domain isolated CPUs. This means that HK_TYPE_KERNEL_NOISE (of which HK_TYPE_WQ is only an alias) should always be a subset of HK_TYPE_DOMAIN. Therefore sane configurations verify: HK_TYPE_KERNEL_NOISE | HK_TYPE_DOMAIN =3D=3D HK_TYPE_DOMAIN Simplify the PCI probe target election accordingly. 
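A userspace sketch of the simplified target election follows. It assumes at most 64 CPUs, models cpumasks as 64-bit integers, and picks the lowest matching CPU where the kernel helper may pick any; the relation is expressed on housekeeping masks, i.e. the HK_TYPE_DOMAIN housekeeping CPUs are assumed to be contained in the HK_TYPE_WQ housekeeping CPUs. mask_any_and(), pci_elect_old() and pci_elect_new() are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, up to 64 CPUs. */
typedef uint64_t mask_t;

/*
 * Return a CPU set in both masks, or nr_cpus if there is none. This models
 * the "cpu < nr_cpu_ids" check that follows the election in the driver.
 */
static inline int mask_any_and(mask_t a, mask_t b, int nr_cpus)
{
	mask_t both = a & b;

	for (int cpu = 0; cpu < nr_cpus; cpu++)
		if ((both >> cpu) & 1)
			return cpu;
	return nr_cpus;
}

/* Old election: node CPUs within (HK_TYPE_WQ and HK_TYPE_DOMAIN). */
static inline int pci_elect_old(mask_t node, mask_t hk_wq, mask_t hk_domain,
				int nr_cpus)
{
	return mask_any_and(node, hk_wq & hk_domain, nr_cpus);
}

/* New election: node CPUs within HK_TYPE_DOMAIN only. */
static inline int pci_elect_new(mask_t node, mask_t hk_domain, int nr_cpus)
{
	return mask_any_and(node, hk_domain, nr_cpus);
}

/* Usage example: node owns CPUs 4-7, CPUs 6-7 are domain isolated. */
static inline void pci_elect_selfcheck(void)
{
	const mask_t node = 0xF0, hk_wq = 0xFF, hk_domain = 0x3F;

	assert(pci_elect_old(node, hk_wq, hk_domain, 8) == 4);
	assert(pci_elect_new(node, hk_domain, 8) == 4);
	/* No housekeeping CPU on the node: caller probes locally instead. */
	assert(pci_elect_new(0xC0, hk_domain, 8) == 8);
}
```

Dropping the intermediate mask also removes the allocation and its error path from the probe, which is where most of the deleted lines in this patch come from.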
Signed-off-by: Frederic Weisbecker --- drivers/pci/pci-driver.c | 17 +++-------------- 1 file changed, 3 insertions(+), 14 deletions(-) diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index ac86aaec8bcf..e731aaf28c76 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -384,16 +384,9 @@ static int pci_call_probe(struct pci_driver *drv, stru= ct pci_dev *dev, pci_physfn_is_probed(dev)) { error =3D local_pci_probe(&ddi); } else { - cpumask_var_t wq_domain_mask; struct pci_probe_arg arg =3D { .ddi =3D &ddi }; =20 INIT_WORK_ONSTACK(&arg.work, local_pci_probe_callback); - - if (!zalloc_cpumask_var(&wq_domain_mask, GFP_KERNEL)) { - error =3D -ENOMEM; - goto out; - } - /* * The target election and the enqueue of the work must be within * the same RCU read side section so that when the workqueue pool @@ -402,12 +395,9 @@ static int pci_call_probe(struct pci_driver *drv, stru= ct pci_dev *dev, * targets. */ rcu_read_lock(); - cpumask_and(wq_domain_mask, - housekeeping_cpumask(HK_TYPE_WQ), - housekeeping_cpumask(HK_TYPE_DOMAIN)); - cpu =3D cpumask_any_and(cpumask_of_node(node), - wq_domain_mask); + housekeeping_cpumask(HK_TYPE_DOMAIN)); + if (cpu < nr_cpu_ids) { struct workqueue_struct *wq =3D pci_probe_wq; =20 @@ -422,10 +412,9 @@ static int pci_call_probe(struct pci_driver *drv, stru= ct pci_dev *dev, error =3D local_pci_probe(&ddi); } =20 - free_cpumask_var(wq_domain_mask); destroy_work_on_stack(&arg.work); } -out: + dev->is_probed =3D 0; cpu_hotplug_enable(); return error; --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56E892BD5A7; Wed, 5 Nov 2025 21:06:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1762376808; cv=none; b=OA7lLIs28oC0OZrW4E04+e1Ye/QIgtvJoIewb5I0ZD9HpemyX4S1sx+ghv9Qwo0i0rpl+WBeH29X1BPIFAHXUHhB50HwaqKkyAlfOBA/h2KOak7RkNDImTLUtODY8GyLSZx7o4mbPGjNtcjoaETTtL7+eHGA55D27oufT0letYo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376808; c=relaxed/simple; bh=msWuzDTcgxqxV1VfqrsWjNwgWbBhxXt72Fo9MOZeVHU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=mFgrfkr64uPhDzkcKcK7P8e3/l9ercKBFMFysyGw81ii3LebqfeT9qCmSEyxO7/5Ox4yFyBPFVB0eMPwoxTN9pBr5uuSefvTltDwJfi6g1S9V8HHGQeb9Q1CU75ZUEwvDL3viAiOFx5Rom6vGn0ImEGXOTa34ugqYvkMBqdgY2M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=OLduUcG/; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="OLduUcG/" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9D176C116B1; Wed, 5 Nov 2025 21:06:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376808; bh=msWuzDTcgxqxV1VfqrsWjNwgWbBhxXt72Fo9MOZeVHU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OLduUcG/WX3sS+65TZ/jB9oleAmH7o+Ij6zm6G7vJRQPWAWn/kHN6E9NbiVBTETTK /tSXg4kSg8acavMwtkMnD5cCSx1kzBA38/REmggw6d/dhe6YWgbA/cArl5URBF/xtS g/5opQo0PL9RHjMmGs4LFg9IMBxDNjJsHV8PMg+doyGddTGf1j7R9RdBGuAap2AqwO +X//Jgt6py+or3LRMvYCSO3DJr68NLAU5ynEv1K/hyNIgEWMRTuGBrgurdbL8p52vW id04tI9qFZLhzJmCp7mudgZziC2RzYyNovODMsflC8Jdl6yqEsILrkfkUBWUyRVtKj ZAphLjB/PodZQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . 
Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 21/31] kthread: Refine naming of affinity related fields Date: Wed, 5 Nov 2025 22:03:37 +0100 Message-ID: <20251105210348.35256-22-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The fields related to the kthreads preferred affinity use "hotplug" as the base of their naming because the affinity management was initially meant to deal only with CPU hotplug. The scope of this role is now going to broaden and also cover cpuset isolated partition updates. Switch the naming accordingly.
Signed-off-by: Frederic Weisbecker --- kernel/kthread.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/kernel/kthread.c b/kernel/kthread.c index 31b072e8d427..c4dd967e9e9c 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -35,8 +35,8 @@ static DEFINE_SPINLOCK(kthread_create_lock); static LIST_HEAD(kthread_create_list); struct task_struct *kthreadd_task; =20 -static LIST_HEAD(kthreads_hotplug); -static DEFINE_MUTEX(kthreads_hotplug_lock); +static LIST_HEAD(kthread_affinity_list); +static DEFINE_MUTEX(kthread_affinity_lock); =20 struct kthread_create_info { @@ -69,7 +69,7 @@ struct kthread { /* To store the full name if task comm is truncated. */ char *full_name; struct task_struct *task; - struct list_head hotplug_node; + struct list_head affinity_node; struct cpumask *preferred_affinity; }; =20 @@ -128,7 +128,7 @@ bool set_kthread_struct(struct task_struct *p) =20 init_completion(&kthread->exited); init_completion(&kthread->parked); - INIT_LIST_HEAD(&kthread->hotplug_node); + INIT_LIST_HEAD(&kthread->affinity_node); p->vfork_done =3D &kthread->exited; =20 kthread->task =3D p; @@ -323,10 +323,10 @@ void __noreturn kthread_exit(long result) { struct kthread *kthread =3D to_kthread(current); kthread->result =3D result; - if (!list_empty(&kthread->hotplug_node)) { - mutex_lock(&kthreads_hotplug_lock); - list_del(&kthread->hotplug_node); - mutex_unlock(&kthreads_hotplug_lock); + if (!list_empty(&kthread->affinity_node)) { + mutex_lock(&kthread_affinity_lock); + list_del(&kthread->affinity_node); + mutex_unlock(&kthread_affinity_lock); =20 if (kthread->preferred_affinity) { kfree(kthread->preferred_affinity); @@ -390,9 +390,9 @@ static void kthread_affine_node(void) return; } =20 - mutex_lock(&kthreads_hotplug_lock); - WARN_ON_ONCE(!list_empty(&kthread->hotplug_node)); - list_add_tail(&kthread->hotplug_node, &kthreads_hotplug); + mutex_lock(&kthread_affinity_lock); + 
WARN_ON_ONCE(!list_empty(&kthread->affinity_node)); + list_add_tail(&kthread->affinity_node, &kthread_affinity_list); /* * The node cpumask is racy when read from kthread() but: * - a racing CPU going down will either fail on the subsequent @@ -402,7 +402,7 @@ static void kthread_affine_node(void) */ kthread_fetch_affinity(kthread, affinity); set_cpus_allowed_ptr(current, affinity); - mutex_unlock(&kthreads_hotplug_lock); + mutex_unlock(&kthread_affinity_lock); =20 free_cpumask_var(affinity); } @@ -876,10 +876,10 @@ int kthread_affine_preferred(struct task_struct *p, c= onst struct cpumask *mask) goto out; } =20 - mutex_lock(&kthreads_hotplug_lock); + mutex_lock(&kthread_affinity_lock); cpumask_copy(kthread->preferred_affinity, mask); - WARN_ON_ONCE(!list_empty(&kthread->hotplug_node)); - list_add_tail(&kthread->hotplug_node, &kthreads_hotplug); + WARN_ON_ONCE(!list_empty(&kthread->affinity_node)); + list_add_tail(&kthread->affinity_node, &kthread_affinity_list); kthread_fetch_affinity(kthread, affinity); =20 /* It's safe because the task is inactive. 
*/ @@ -887,7 +887,7 @@ int kthread_affine_preferred(struct task_struct *p, con= st struct cpumask *mask) do_set_cpus_allowed(p, affinity); raw_spin_unlock_irqrestore(&p->pi_lock, flags); =20 - mutex_unlock(&kthreads_hotplug_lock); + mutex_unlock(&kthread_affinity_lock); out: free_cpumask_var(affinity); =20 @@ -908,9 +908,9 @@ static int kthreads_online_cpu(unsigned int cpu) struct kthread *k; int ret; =20 - guard(mutex)(&kthreads_hotplug_lock); + guard(mutex)(&kthread_affinity_lock); =20 - if (list_empty(&kthreads_hotplug)) + if (list_empty(&kthread_affinity_list)) return 0; =20 if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) @@ -918,7 +918,7 @@ static int kthreads_online_cpu(unsigned int cpu) =20 ret =3D 0; =20 - list_for_each_entry(k, &kthreads_hotplug, hotplug_node) { + list_for_each_entry(k, &kthread_affinity_list, affinity_node) { if (WARN_ON_ONCE((k->task->flags & PF_NO_SETAFFINITY) || kthread_is_per_cpu(k->task))) { ret =3D -EINVAL; --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 78AB434D4E1; Wed, 5 Nov 2025 21:06:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376816; cv=none; b=FbE/uDVH9p3z3OvyV9c1CBFdZ9I7U8EupxH0LI0KwEzFwnjHM3+YAgyPNBlZmaEAugokJtnq1rKDN5azFA+qMXGY4IKfvmfsjLNkZK5eB3KnbxgwQSrdS1HgSqbOFNf2BRAxZU5SvxOa/5pPrYu0VdmfNy308/lQua9buu383vU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376816; c=relaxed/simple; bh=rc90r/VJRIFUeghTYji80GjIJHgeRfSLzAtAWtKNZdY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=Td9iluG47f5twEtCRnu041Aj6TOxZPkhjNkUsMkS1jDkQqZJwOhnE81btX9q8zVQwgrLIs6yMLFnf74FxKVZ8pXp7jkDw86o5UyQAyRbzNmcBNEOUvYH7Js2TjQEIGEcHnpsxsTPP+JGSl1FUV1mc6BY+XzzQaa/YE1ppUIiAKo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=hz7xihx1; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hz7xihx1" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 918F2C116B1; Wed, 5 Nov 2025 21:06:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376816; bh=rc90r/VJRIFUeghTYji80GjIJHgeRfSLzAtAWtKNZdY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hz7xihx1glmCXa1xYOTXCOtZCwF0OTEG4MDXz4ESiFKIT/eSzyLgw6tdM/A/8ArrJ IEd6sw+KI8CCzzGQUHk4lbghlqPOFoH0V+bJWqxwNfBrriit2+q0O9a/xMKNAVftG1 pdEtz6CY10kisa6Dvp3UjaeNE1tQMLiHK0nc3QJKfijSBQW1LtHsUNCwdyFZvY3JRx 30ZUh5GLz8QEnuL2xArNJEP1+nfwTiM+E7KRZVU84r9zqRWV8rANivY1p/SX18pPk8 Q13aJlDvbtRaohmyRA5PXJR2bE/8UKI4+NRSo21D37HRxvequjK1Me/ZODpo/5OSaB S5lgY0KgQyIzw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 22/31] kthread: Include unbound kthreads in the managed affinity list Date: Wed, 5 Nov 2025 22:03:38 +0100 Message-ID: <20251105210348.35256-23-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The managed affinity list currently contains only unbound kthreads that have affinity preferences. Unbound kthreads that are globally affine by default are outside of the list because their affinity is automatically managed by the scheduler (through the fallback housekeeping mask) and by cpuset. However, in order to preserve the preferred affinity of kthreads, cpuset will delegate the isolated partition update propagation to the housekeeping and kthread code. Prepare for that by including all unbound kthreads in the managed affinity list.
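For illustration, the affinity computation that now applies to every unbound kthread can be sketched in userspace. The model below assumes at most 64 CPUs, represents cpumasks as 64-bit integers, and uses a preferred mask of 0 to mean "no preferred affinity". The final fallback to the housekeeping mask when the intersection is empty is an assumption of this sketch, taken from the description of the existing behaviour rather than from the lines changed by this patch. struct kthread_model, NO_NODE and fetch_affinity() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, up to 64 CPUs. */
typedef uint64_t mask_t;

#define NO_NODE (-1) /* hypothetical stand-in for NUMA_NO_NODE */

struct kthread_model {
	mask_t preferred; /* 0 means no preferred affinity was set */
	int node;         /* preferred NUMA node, or NO_NODE */
};

/*
 * Pick the base preference (explicit mask, node mask, or housekeeping for
 * globally affine kthreads), restrict it to housekeeping CPUs, and fall
 * back to housekeeping when nothing is left (assumed behaviour).
 */
static inline mask_t fetch_affinity(const struct kthread_model *k,
				    const mask_t *node_masks, mask_t hk)
{
	mask_t pref, eff;

	if (k->preferred)
		pref = k->preferred;
	else if (k->node == NO_NODE)
		pref = hk;
	else
		pref = node_masks[k->node];

	eff = pref & hk;
	return eff ? eff : hk;
}

/* Usage example: two nodes of 4 CPUs each, CPUs 6-7 isolated. */
static inline void kthread_model_selfcheck(void)
{
	const mask_t node_masks[] = { 0x0F, 0xF0 };
	const mask_t hk = 0x3F;
	const struct kthread_model global = { 0, NO_NODE };
	const struct kthread_model on_node1 = { 0, 1 };
	const struct kthread_model pref_isolated = { 0xC0, NO_NODE };

	assert(fetch_affinity(&global, node_masks, hk) == 0x3F);
	assert(fetch_affinity(&on_node1, node_masks, hk) == 0x30);
	assert(fetch_affinity(&pref_isolated, node_masks, hk) == 0x3F);
}
```

Treating globally affine kthreads as "preferring" the housekeeping mask lets them share the same list and the same update path as kthreads with a node or an explicit preference.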
Signed-off-by: Frederic Weisbecker --- kernel/kthread.c | 70 ++++++++++++++++++++++++++++-------------------- 1 file changed, 41 insertions(+), 29 deletions(-) diff --git a/kernel/kthread.c b/kernel/kthread.c index c4dd967e9e9c..b4794241420f 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -365,9 +365,10 @@ static void kthread_fetch_affinity(struct kthread *kth= read, struct cpumask *cpum if (kthread->preferred_affinity) { pref =3D kthread->preferred_affinity; } else { - if (WARN_ON_ONCE(kthread->node =3D=3D NUMA_NO_NODE)) - return; - pref =3D cpumask_of_node(kthread->node); + if (kthread->node =3D=3D NUMA_NO_NODE) + pref =3D housekeeping_cpumask(HK_TYPE_KTHREAD); + else + pref =3D cpumask_of_node(kthread->node); } =20 cpumask_and(cpumask, pref, housekeeping_cpumask(HK_TYPE_KTHREAD)); @@ -380,32 +381,29 @@ static void kthread_affine_node(void) struct kthread *kthread =3D to_kthread(current); cpumask_var_t affinity; =20 - WARN_ON_ONCE(kthread_is_per_cpu(current)); + if (WARN_ON_ONCE(kthread_is_per_cpu(current))) + return; =20 - if (kthread->node =3D=3D NUMA_NO_NODE) { - housekeeping_affine(current, HK_TYPE_KTHREAD); - } else { - if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) { - WARN_ON_ONCE(1); - return; - } - - mutex_lock(&kthread_affinity_lock); - WARN_ON_ONCE(!list_empty(&kthread->affinity_node)); - list_add_tail(&kthread->affinity_node, &kthread_affinity_list); - /* - * The node cpumask is racy when read from kthread() but: - * - a racing CPU going down will either fail on the subsequent - * call to set_cpus_allowed_ptr() or be migrated to housekeepers - * afterwards by the scheduler. 
- * - a racing CPU going up will be handled by kthreads_online_cpu() - */ - kthread_fetch_affinity(kthread, affinity); - set_cpus_allowed_ptr(current, affinity); - mutex_unlock(&kthread_affinity_lock); - - free_cpumask_var(affinity); + if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) { + WARN_ON_ONCE(1); + return; } + + mutex_lock(&kthread_affinity_lock); + WARN_ON_ONCE(!list_empty(&kthread->affinity_node)); + list_add_tail(&kthread->affinity_node, &kthread_affinity_list); + /* + * The node cpumask is racy when read from kthread() but: + * - a racing CPU going down will either fail on the subsequent + * call to set_cpus_allowed_ptr() or be migrated to housekeepers + * afterwards by the scheduler. + * - a racing CPU going up will be handled by kthreads_online_cpu() + */ + kthread_fetch_affinity(kthread, affinity); + set_cpus_allowed_ptr(current, affinity); + mutex_unlock(&kthread_affinity_lock); + + free_cpumask_var(affinity); } =20 static int kthread(void *_create) @@ -924,8 +922,22 @@ static int kthreads_online_cpu(unsigned int cpu) ret =3D -EINVAL; continue; } - kthread_fetch_affinity(k, affinity); - set_cpus_allowed_ptr(k->task, affinity); + + /* + * Unbound kthreads without preferred affinity are already affine + * to housekeeping, whether those CPUs are online or not. So no need + * to handle newly online CPUs for them. + * + * But kthreads with a preferred affinity or node are different: + * if none of their preferred CPUs are online and part of + * housekeeping at the same time, they must be affine to housekeeping. + * But as soon as one of their preferred CPU becomes online, they must + * be affine to them. 
+ */ + if (k->preferred_affinity || k->node !=3D NUMA_NO_NODE) { + kthread_fetch_affinity(k, affinity); + set_cpus_allowed_ptr(k->task, affinity); + } } =20 free_cpumask_var(affinity); --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 985BF34AB04; Wed, 5 Nov 2025 21:07:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376826; cv=none; b=qF666jAPbsgbJ/JnX7wOT2UKop/xfRD2UU/vI47NbYxAJysbCH/ZNb5wrGvi4EZkCGkO0rTQP4EZTfae+x4+iuKjLWHTI8C2RBC9nqMhlqigqlR9VUTIISPmi9Oxc5aJWqVxiFtNVj+Zn0zQHMoSLsnU225yFcyk5SDW4iayaaM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376826; c=relaxed/simple; bh=Edp/l5R7ShU52gSdBQzYrc5ERchBT8H20BRz7+yt6ag=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=PZkxtXuQz+rJibvwij8WiGS9zIseakOVtODCWGJNJslDuOdx+5/jTqIXfl2Jzihdm5WblpEty8lApEnjhjZize8CiNUyPtHvk4PUet6DFesXQsDkOmJBtCckNnuO6NnUUsB7JGHJ0RspX7E0kG6Wvj73TWwTVCAjFbM0JTCD0ds= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MPSSO0ld; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MPSSO0ld" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8D8A3C116D0; Wed, 5 Nov 2025 21:06:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376824; bh=Edp/l5R7ShU52gSdBQzYrc5ERchBT8H20BRz7+yt6ag=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MPSSO0ldnwPIBE+qLDpp1WsWWdAtLlye9zEuu09MzwKOcsHq9395lCttXRPZS6Oy2 
QOn05QDbGG/4CbYqnqKtnWdiOdxjEzs+mK9npTMOPEuZDA9bFtQETVdHAEUjsc1pzA 1pTFXjXqxdXU9XG1PauidLdVW/EBkGM/kCy4qCxc4IXzIWW3k1VFeMsW2ydpAdJHBt Fo8USBYiZkFbocLeiLzW5xWCg3f66bfqZH9ozdEguU4zJotEVsfyX0yDrIyfWA2rfO L/ATG9JJoKL0bxr5s86A7QA3/4MQD5/SOhQwFxy1r/vPat5+JRw+rwnuJO9VjzTpvR glGPUoTcVb2NQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 23/31] kthread: Include kthreadd to the managed affinity list Date: Wed, 5 Nov 2025 22:03:39 +0100 Message-ID: <20251105210348.35256-24-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The unbound kthreads affinity management performed by cpuset is going to be imported to the kthread core code for consolidation purposes. Treat kthreadd just like any other kthread. 
Signed-off-by: Frederic Weisbecker --- kernel/kthread.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/kthread.c b/kernel/kthread.c index b4794241420f..86abfbc21bb0 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -820,12 +820,13 @@ int kthreadd(void *unused) /* Setup a clean context for our children to inherit. */ set_task_comm(tsk, comm); ignore_signals(tsk); - set_cpus_allowed_ptr(tsk, housekeeping_cpumask(HK_TYPE_KTHREAD)); set_mems_allowed(node_states[N_MEMORY]); =20 current->flags |=3D PF_NOFREEZE; cgroup_init_kthreadd(); =20 + kthread_affine_node(); + for (;;) { set_current_state(TASK_INTERRUPTIBLE); if (list_empty(&kthread_create_list)) --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 382D6337BB2; Wed, 5 Nov 2025 21:07:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376832; cv=none; b=XO70Vamx8nV1lb00FFg2k9X066UMk/fUlc0tCpeJDeGu8oZ+E0xrFCM1WW6n+xaktARqn2jQUQDU2At2wWPAV26EBnwNYfNC3RbFXymUClzMeZBpHC5+4znwRIfuHdBH+YBVniwdNHeBrUfnbG0BVtTZOKEIBteatS41Ay+G9Vs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376832; c=relaxed/simple; bh=J6j4zsULKpjy2xGkRdPN0GqxRFQVWDdmCEJclS8fknA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=tp40LY3qV4op9IGVhMvOUZIoRuQFmGsmp4QnvwcTCLgZ9MpKUXG7POWuxWRGxRMDGTh0XNxNpeuN9GtDQDJ4S0HsZQRyjk1Se72J8iEKbcwbOxbi+3hmWRBqCHpDlg9n2r41pOlSdlcuWJ5wkCuYZaqcUfGdXBM1hq4wEftibnk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=szIDSeRI; arc=none smtp.client-ip=10.30.226.201 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="szIDSeRI" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 81075C4CEF5; Wed, 5 Nov 2025 21:07:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376832; bh=J6j4zsULKpjy2xGkRdPN0GqxRFQVWDdmCEJclS8fknA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=szIDSeRICvuPVa22uckyIj7tLtbwryeVGfvVRmV5zDtNt+6NVBCBqBwYblGbDqHv6 WQdkxwLPuV6izEzNadmWY/Cp9t5y4JB4wOFuPfAabLsKAgXFvGeFbrrhzqeNduk8ES yUOiqND2LZGDY74RENtGR/Bq7PP6DayV2rI64BjaalSyI4LE4lkSiJrfizlKFn1Emx uvFPUHCmqtSHSOY5v+/BkFDW/Z0kokfN1Y/al6wn7QE3P1GrD20T3kbhNhdCZ3GLRg rZ7BiZ/qbZ2XKuvtAYnPwl0IiJ1OfTlvfENurtuaeb+8HQO0rRM7YpPG6V6BEzxd/v N3mdEa+6d4mDw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 24/31] kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management Date: Wed, 5 Nov 2025 22:03:40 +0100 Message-ID: <20251105210348.35256-25-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Unbound kthreads want to run neither on nohz_full CPUs nor on domain isolated CPUs. And since nohz_full implies domain isolation, checking the latter is enough to verify both. Therefore exclude kthreads from domain isolation. 
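As a rough userspace illustration of the selection rule this change affects (not the kernel code itself), the sketch below models a cpumask as a 64-bit word. The names sketch_fetch_affinity, cpumask_bits and SKETCH_NO_NODE are hypothetical, and a preferred mask of 0 stands in for "no preferred affinity set", which is an assumption of the sketch only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, up to 64 CPUs. */
typedef uint64_t cpumask_bits;

/* Sentinel meaning "no NUMA node preference" (used by this sketch only). */
#define SKETCH_NO_NODE (-1)

/*
 * Pick the effective affinity of an unbound kthread.
 *
 * preferred: explicit preferred affinity, 0 when none is set (assumption)
 * node:      NUMA node of the kthread, or SKETCH_NO_NODE
 * node_mask: CPUs of that node, ignored when node == SKETCH_NO_NODE
 * hk_domain: CPUs that are not domain isolated, must not be empty
 */
static cpumask_bits sketch_fetch_affinity(cpumask_bits preferred, int node,
                                          cpumask_bits node_mask,
                                          cpumask_bits hk_domain)
{
    cpumask_bits pref, result;

    /* At least one housekeeping CPU is assumed to exist. */
    assert(hk_domain != 0);

    if (preferred)
        pref = preferred;          /* explicit preference wins */
    else if (node == SKETCH_NO_NODE)
        pref = hk_domain;          /* no preference: any housekeeping CPU */
    else
        pref = node_mask;          /* per-node kthread: CPUs of its node */

    /* Never run on domain isolated CPUs. */
    result = pref & hk_domain;

    /* If every preferred CPU is isolated, fall back to housekeeping. */
    if (!result)
        result = hk_domain;

    return result;
}
```

For instance, with CPUs 0-3 as housekeeping, a kthread whose node only covers CPUs 6-7 ends up on CPUs 0-3, while a kthread whose node covers CPUs 0-1 stays there.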
Signed-off-by: Frederic Weisbecker --- kernel/kthread.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/kernel/kthread.c b/kernel/kthread.c index 86abfbc21bb0..69d70baceba2 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -362,18 +362,20 @@ static void kthread_fetch_affinity(struct kthread *kt= hread, struct cpumask *cpum { const struct cpumask *pref; =20 + guard(rcu)(); + if (kthread->preferred_affinity) { pref =3D kthread->preferred_affinity; } else { if (kthread->node =3D=3D NUMA_NO_NODE) - pref =3D housekeeping_cpumask(HK_TYPE_KTHREAD); + pref =3D housekeeping_cpumask(HK_TYPE_DOMAIN); else pref =3D cpumask_of_node(kthread->node); } =20 - cpumask_and(cpumask, pref, housekeeping_cpumask(HK_TYPE_KTHREAD)); + cpumask_and(cpumask, pref, housekeeping_cpumask(HK_TYPE_DOMAIN)); if (cpumask_empty(cpumask)) - cpumask_copy(cpumask, housekeeping_cpumask(HK_TYPE_KTHREAD)); + cpumask_copy(cpumask, housekeeping_cpumask(HK_TYPE_DOMAIN)); } =20 static void kthread_affine_node(void) --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 79F90359F8F; Wed, 5 Nov 2025 21:07:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376840; cv=none; b=LTjd+Ab+pli75Act9HbyF3TcydtMKFJy5788BTrf50e26MEpGmOsilbKxII1dMSL9RTH8/iBHjVITdgCzDPTv3LWHuh7Dh61DcGMbCZiK/m2VznyFcQlaHkiPDuszsG6Os3QMLfc5FMqdetd9HGRuKqCxZOagxasDFepQVDvufo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376840; c=relaxed/simple; bh=Rz519K4RzNFN9C7FTifkrc2s4js15D9og5elk9stWlE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=hOjYBwXqSogA4KMDjY3UjeaSePNR5twPcbJxA9UBSPeZ7fDv6BjKB7NWmBcRQGlm5vU54yRe8/y47SMIfRmy82SqBkcIZcC1+bOsX2jOb/lfnXPWvj4DJxK5LpAcF8RUgUQgzeysoDPq7QDZ4bpAjr+j8jtjj5FEU7OHxJW7PRk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=TIAbm/F3; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="TIAbm/F3" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 798D5C116D0; Wed, 5 Nov 2025 21:07:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376840; bh=Rz519K4RzNFN9C7FTifkrc2s4js15D9og5elk9stWlE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TIAbm/F3lkoEFLNp3ZLWekoviyl6iWh5KqGoC3jN5pAXOi5L/sWwbTneX1Sd1f2Kl 7zL5xqMjp906phdgY3w8G4Loes9IQaHO6Jg1QGxIqMQsUXfkKAHhFb1mPBnHUnse7X h2mJe/zjqzeSRH2cgUsMRXuxYaohkVkO/lgKbv6DBati8qhfwhtKaoi7hBjPHR4dMv 8b9ApEnTcxqi22/RVPbXSqTskMNkQjcdP1Pb7XbijRLMlZfhJpdGLk1g9A53gP9snv vGCzTSX8nxacEQ8fa/P2hESWu04ejojFuxjWPxaXBvPpcyAYLwEkBrNtfCM7qw2bfC Dnzv5/9V3vOew== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 25/31] sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN Date: Wed, 5 Nov 2025 22:03:41 +0100 Message-ID: <20251105210348.35256-26-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Tasks that have all their allowed CPUs offline don't want their affinity to fallback on either nohz_full CPUs or on domain isolated CPUs. And since nohz_full implies domain isolation, checking the latter is enough to verify both. Therefore exclude domain isolation from fallback task affinity. 
Signed-off-by: Frederic Weisbecker --- include/linux/mmu_context.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/mmu_context.h b/include/linux/mmu_context.h index ac01dc4eb2ce..ed3dd0f3fe19 100644 --- a/include/linux/mmu_context.h +++ b/include/linux/mmu_context.h @@ -24,7 +24,7 @@ static inline void leave_mm(void) { } #ifndef task_cpu_possible_mask # define task_cpu_possible_mask(p) cpu_possible_mask # define task_cpu_possible(cpu, p) true -# define task_cpu_fallback_mask(p) housekeeping_cpumask(HK_TYPE_TICK) +# define task_cpu_fallback_mask(p) housekeeping_cpumask(HK_TYPE_DOMAIN) #else # define task_cpu_possible(cpu, p) cpumask_test_cpu((cpu), task_cpu_possib= le_mask(p)) #endif --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 41FD634DB59; Wed, 5 Nov 2025 21:07:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376848; cv=none; b=gxV8+hCug3ksK/YbMdyDXzjHksgdOGbTMS82TpW9P10q5m8yNfDxDG1E1xsLl9xhS98UxyFt91nmXbh0rAeFvYErXsJPzpNhwg0L9UpzBVNMYZ5AOYJmp7epUZyJARfY6woGaUno7Y6yFIKYFPlYaj+M/RSl9LXY5mOvngszOx4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376848; c=relaxed/simple; bh=JHjdxV/oITldUpJEMoCsfp/cWcTZCJek1YGVWW127DY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=X9RQi6ht7pvrV/t2jrZrtvy1kuVxgO3O9dE8lpJWWtJJ32iLd/vIemFJsXPPd2cS8Pm0pelFikWUKpfHOv/8odQoeky+LOQRVDGYuN1aAGK4aszb4pF7J6ps/qNda1BrtwIWkE0/c5XP8dxdAfwBFQkti9Mnh8r/fvoinYLV57I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=UOtgUGl4; arc=none 
smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="UOtgUGl4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 75DB2C116B1; Wed, 5 Nov 2025 21:07:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376847; bh=JHjdxV/oITldUpJEMoCsfp/cWcTZCJek1YGVWW127DY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UOtgUGl4EzIT/CXtpPARtWFryXQQmWL1KW/pmPm22LHMDdgbfer7GuF7zoHKnxHqD TB4QqR5XgJcd301661MTdPJCcxhHdWwK6+v9/a2D5jrHEcrxFjwsmYvJ69GAjnfSXc JmDP/+paNlSwRUwnoWVwq6PHqKb967VDINvF6ALtLmEl5ZK2M4Ofx/bl9AYMVJH9Rl AFMWTf7iYm9QKnVsrpVA/GIn0bAqNBwkTW1syD35Kawwb9Cd2hde1s5GBOrKRTLdqB /ASLAVCihQZyf2B71XvlxP0975zoCeKrez94TdSXcb1rVwMldhzZmmm9/IIfjVVqZs dlHQCiNEdGKYQ== From: Frederic Weisbecker To: LKML Cc: Gabriele Monaco , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Frederic Weisbecker , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 26/31] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping Date: Wed, 5 Nov 2025 22:03:42 +0100 Message-ID: <20251105210348.35256-27-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Gabriele Monaco Currently the user can set up isolated CPUs via cpuset and nohz_full in a way that leaves no housekeeping CPU (i.e. no CPU that is neither domain isolated nor nohz_full). This can be a problem for other subsystems (e.g. the timer wheel migration). Prevent this configuration by blocking any assignment that would cause the union of domain isolated CPUs and nohz_full CPUs to cover all CPUs.
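The check described above can be approximated in userspace as follows. This is only a sketch under stated assumptions: cpumasks are modelled as 64-bit words, and sketch_can_isolate and its parameter names are hypothetical rather than the kernel's API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, up to 64 CPUs. */
typedef uint64_t cpumask_bits;

/*
 * Return true if isolating the CPUs in 'add' still leaves at least one
 * active CPU that is neither nohz_full nor domain isolated.
 *
 * nohz_full_enabled: whether nohz_full is in use at all
 * hk_noise:  CPUs that are not nohz_full
 * hk_domain: CPUs that are not domain isolated at boot
 * isolated:  CPUs currently isolated through cpuset partitions
 * active:    CPUs currently active
 * add:       CPUs about to become isolated
 * del:       CPUs about to stop being isolated (0 when none)
 */
static bool sketch_can_isolate(bool nohz_full_enabled,
                               cpumask_bits hk_noise, cpumask_bits hk_domain,
                               cpumask_bits isolated, cpumask_bits active,
                               cpumask_bits add, cpumask_bits del)
{
    cpumask_bits full_hk;

    /* Without nohz_full there is no combined conflict to check. */
    if (!nohz_full_enabled)
        return true;

    /* A non-nohz_full CPU is being released from isolation. */
    if (del & hk_noise)
        return true;

    /* CPUs that are currently housekeeping in every respect. */
    full_hk = hk_noise & hk_domain & ~isolated & active;

    /* At least one of them must survive the new isolation request. */
    return (full_hk & ~add) != 0;
}
```

With eight active CPUs and nohz_full covering CPUs 4-7, a request to isolate CPUs 0-3 is refused because no housekeeping CPU would remain, whereas isolating only CPUs 0-2 is accepted.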
Acked-by: Frederic Weisbecker Reviewed-by: Waiman Long Signed-off-by: Gabriele Monaco Signed-off-by: Frederic Weisbecker --- kernel/cgroup/cpuset.c | 63 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 63 insertions(+) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index e19d3375a4ec..d1a799e361c3 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1327,6 +1327,19 @@ static void isolated_cpus_update(int old_prs, int ne= w_prs, struct cpumask *xcpus cpumask_andnot(isolated_cpus, isolated_cpus, xcpus); } =20 +/* + * isolated_cpus_should_update - Returns if the isolated_cpus mask needs u= pdate + * @prs: new or old partition_root_state + * @parent: parent cpuset + * Return: true if isolated_cpus needs modification, false otherwise + */ +static bool isolated_cpus_should_update(int prs, struct cpuset *parent) +{ + if (!parent) + parent =3D &top_cpuset; + return prs !=3D parent->partition_root_state; +} + /* * partition_xcpus_add - Add new exclusive CPUs to partition * @new_prs: new partition_root_state @@ -1391,6 +1404,42 @@ static bool partition_xcpus_del(int old_prs, struct = cpuset *parent, return isolcpus_updated; } =20 +/* + * isolated_cpus_can_update - check for isolated & nohz_full conflicts + * @add_cpus: cpu mask for cpus that are going to be isolated + * @del_cpus: cpu mask for cpus that are no longer isolated, can be NULL + * Return: false if there is conflict, true otherwise + * + * If nohz_full is enabled and we have isolated CPUs, their combination mu= st + * still leave housekeeping CPUs. 
+ */ +static bool isolated_cpus_can_update(struct cpumask *add_cpus, + struct cpumask *del_cpus) +{ + cpumask_var_t full_hk_cpus; + int res =3D true; + + if (!housekeeping_enabled(HK_TYPE_KERNEL_NOISE)) + return true; + + if (del_cpus && cpumask_weight_and(del_cpus, + housekeeping_cpumask(HK_TYPE_KERNEL_NOISE))) + return true; + + if (!alloc_cpumask_var(&full_hk_cpus, GFP_KERNEL)) + return false; + + cpumask_and(full_hk_cpus, housekeeping_cpumask(HK_TYPE_KERNEL_NOISE), + housekeeping_cpumask(HK_TYPE_DOMAIN)); + cpumask_andnot(full_hk_cpus, full_hk_cpus, isolated_cpus); + cpumask_and(full_hk_cpus, full_hk_cpus, cpu_active_mask); + if (!cpumask_weight_andnot(full_hk_cpus, add_cpus)) + res =3D false; + + free_cpumask_var(full_hk_cpus); + return res; +} + static void update_housekeeping_cpumask(bool isolcpus_updated) { int ret; @@ -1538,6 +1587,9 @@ static int remote_partition_enable(struct cpuset *cs,= int new_prs, if (!cpumask_intersects(tmp->new_cpus, cpu_active_mask) || cpumask_subset(top_cpuset.effective_cpus, tmp->new_cpus)) return PERR_INVCPUS; + if (isolated_cpus_should_update(new_prs, NULL) && + !isolated_cpus_can_update(tmp->new_cpus, NULL)) + return PERR_HKEEPING; =20 spin_lock_irq(&callback_lock); isolcpus_updated =3D partition_xcpus_add(new_prs, NULL, tmp->new_cpus); @@ -1637,6 +1689,9 @@ static void remote_cpus_update(struct cpuset *cs, str= uct cpumask *xcpus, else if (cpumask_intersects(tmp->addmask, subpartitions_cpus) || cpumask_subset(top_cpuset.effective_cpus, tmp->addmask)) cs->prs_err =3D PERR_NOCPUS; + else if (isolated_cpus_should_update(prs, NULL) && + !isolated_cpus_can_update(tmp->addmask, tmp->delmask)) + cs->prs_err =3D PERR_HKEEPING; if (cs->prs_err) goto invalidate; } @@ -1984,6 +2039,12 @@ static int update_parent_effective_cpumask(struct cp= uset *cs, int cmd, return err; } =20 + if (deleting && isolated_cpus_should_update(new_prs, parent) && + !isolated_cpus_can_update(tmp->delmask, tmp->addmask)) { + cs->prs_err =3D PERR_HKEEPING; + 
return PERR_HKEEPING; + } + /* * Change the parent's effective_cpus & effective_xcpus (top cpuset * only). @@ -2999,6 +3060,8 @@ static int update_prstate(struct cpuset *cs, int new_= prs) * Need to update isolated_cpus. */ isolcpus_updated =3D true; + if (!isolated_cpus_can_update(cs->effective_xcpus, NULL)) + err =3D PERR_HKEEPING; } else { /* * Switching back to member is always allowed even if it --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0CF4F337B9D; Wed, 5 Nov 2025 21:07:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376856; cv=none; b=oasv1v5HY4HBMEzh6Vz1JdAqSJqzKvoDo1mGcad/vuYATx57y9vKyWWLmk6OiSGZziyGtHj4FQ+Gp7KYuKB3T3cRvyD01b0yWHn5aibkYks+0nFEW9N9Ptz40YzYE+PPFTKSWoYhe+UUjKAdlWYQmmfJ0ZdOUDhgZm5u1cjrStY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376856; c=relaxed/simple; bh=TCF8MVyq430j5XDUVoolby9dsM0WVQ59dd/A7DweIIA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=aEwevXBm73c9hexEIxZ8absNHuiZIEp64RtkzVRL9tD4Sm5R/YUyX700NC1bHJbaKXhlQRq/9UsrvnWxur9Ddxf4YE9xlnfkgTEG/TEcWlZJlYFKDqfZhj8XftgPb1Rpuo9zVmkGOpczx+lvvYitm5ORiTQOaFnf/6jgdd6cwaU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=o0fFnugQ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="o0fFnugQ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 414CEC4CEF5; Wed, 5 Nov 2025 21:07:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=kernel.org; s=k20201202; t=1762376855; bh=TCF8MVyq430j5XDUVoolby9dsM0WVQ59dd/A7DweIIA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=o0fFnugQTFG/RXlQvZAg0LmxL1rabmJvLKEI4i8jnG6+49xPoRoH26kAR0uHunR+2 ZBcBQWJjRgJdZGXcxSSDAWoKhlJLOYizJICcLoAqcUpNI65ySZq1Jx9WH1trZF4POQ 7WuQTmc+rMjo52wqF1ZQ+MRo9RAQslGBt7JQA6Q7TMwtF/Og5JisgtD9+q2KrhYdLQ Lhos2xSI9YbHGL1zN40m/fuIC2OLXxkzUGebXsoL/50URqNzfH3kG4I44qL5FQ5nrn AlYgoiUzof7XSK2JcGXomAMMMDNb6wHnnXpW1qgjj7vWMjH7YnJPJD5jl9XRQVdHKg n2LaM+LJdYrnQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 27/31] sched/arm64: Move fallback task cpumask to HK_TYPE_DOMAIN Date: Wed, 5 Nov 2025 22:03:43 +0100 Message-ID: <20251105210348.35256-28-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When none of the allowed CPUs of a task are online, it gets migrated to the fallback cpumask which is all the non nohz_full CPUs. 
However just like nohz_full CPUs, domain isolated CPUs don't want to be disturbed by tasks that have lost their CPU affinities. And since nohz_full rely on domain isolation to work correctly, the housekeeping mask of domain isolated CPUs should always be a superset of the housekeeping mask of nohz_full CPUs (there can be CPUs that are domain isolated but not nohz_full, OTOH there shouldn't be nohz_full CPUs that are not domain isolated): HK_TYPE_DOMAIN | HK_TYPE_KERNEL_NOISE =3D=3D HK_TYPE_DOMAIN Therefore use HK_TYPE_DOMAIN as the appropriate fallback target for tasks and since this cpumask can be modified at runtime, make sure that 32 bits support CPUs on ARM64 mismatched systems are not isolated by cpusets. Signed-off-by: Frederic Weisbecker --- arch/arm64/kernel/cpufeature.c | 18 +++++++++++++++--- include/linux/cpu.h | 4 ++++ kernel/cgroup/cpuset.c | 17 ++++++++++++++--- 3 files changed, 33 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 5ed401ff79e3..4296b149ccf0 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1655,6 +1655,18 @@ has_cpuid_feature(const struct arm64_cpu_capabilitie= s *entry, int scope) return feature_matches(val, entry); } =20 +/* + * 32 bits support CPUs can't be isolated because tasks may be + * arbitrarily affine to them, defeating the purpose of isolation. 
+ */ +bool arch_isolated_cpus_can_update(struct cpumask *new_cpus) +{ + if (static_branch_unlikely(&arm64_mismatched_32bit_el0)) + return !cpumask_intersects(cpu_32bit_el0_mask, new_cpus); + else + return true; +} + const struct cpumask *system_32bit_el0_cpumask(void) { if (!system_supports_32bit_el0()) @@ -1668,7 +1680,7 @@ const struct cpumask *system_32bit_el0_cpumask(void) =20 const struct cpumask *task_cpu_fallback_mask(struct task_struct *p) { - return __task_cpu_possible_mask(p, housekeeping_cpumask(HK_TYPE_TICK)); + return __task_cpu_possible_mask(p, housekeeping_cpumask(HK_TYPE_DOMAIN)); } =20 static int __init parse_32bit_el0_param(char *str) @@ -3922,8 +3934,8 @@ static int enable_mismatched_32bit_el0(unsigned int c= pu) bool cpu_32bit =3D false; =20 if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) { - if (!housekeeping_cpu(cpu, HK_TYPE_TICK)) - pr_info("Treating adaptive-ticks CPU %u as 64-bit only\n", cpu); + if (!housekeeping_cpu(cpu, HK_TYPE_DOMAIN)) + pr_info("Treating domain isolated CPU %u as 64-bit only\n", cpu); else cpu_32bit =3D true; } diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 487b3bf2e1ea..0b48af25ab5c 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -229,4 +229,8 @@ static inline bool cpu_attack_vector_mitigated(enum cpu= _attack_vectors v) #define smt_mitigations SMT_MITIGATIONS_OFF #endif =20 +struct cpumask; + +bool arch_isolated_cpus_can_update(struct cpumask *new_cpus); + #endif /* _LINUX_CPU_H_ */ diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index d1a799e361c3..817c07a7a1b4 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1404,14 +1404,22 @@ static bool partition_xcpus_del(int old_prs, struct= cpuset *parent, return isolcpus_updated; } =20 +bool __weak arch_isolated_cpus_can_update(struct cpumask *new_cpus) +{ + return true; +} + /* - * isolated_cpus_can_update - check for isolated & nohz_full conflicts + * isolated_cpus_can_update - check for conflicts against 
housekeeping and + * CPUs capabilities. * @add_cpus: cpu mask for cpus that are going to be isolated * @del_cpus: cpu mask for cpus that are no longer isolated, can be NULL * Return: false if there is conflict, true otherwise * - * If nohz_full is enabled and we have isolated CPUs, their combination mu= st - * still leave housekeeping CPUs. + * Check for conflicts: + * - If nohz_full is enabled and there are isolated CPUs, their combinatio= n must + * still leave housekeeping CPUs. + * - Architecture has CPU capabilities incompatible with being isolated */ static bool isolated_cpus_can_update(struct cpumask *add_cpus, struct cpumask *del_cpus) @@ -1419,6 +1427,9 @@ static bool isolated_cpus_can_update(struct cpumask *= add_cpus, cpumask_var_t full_hk_cpus; int res =3D true; =20 + if (!arch_isolated_cpus_can_update(add_cpus)) + return false; + if (!housekeeping_enabled(HK_TYPE_KERNEL_NOISE)) return true; =20 --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C028534CFA0; Wed, 5 Nov 2025 21:07:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376863; cv=none; b=LP8USN7mSA3cxpGJwS6/DTv24bIqH050KD3LXKloGd1jxB81LmlwL/CeFHIPPBKZ6iQV+7lqeL8w2kf6xHcDntsMxVD125BslTFkhH4Ku3yu0pWIxRiiTiXgDxPIKGtN6csSCj7eZo4IfowU5QmA4hFuMuzOB/1qsDZWfQ2XGpc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376863; c=relaxed/simple; bh=6jwYeig2zqPrmwGwM4pQrsHIUOv6qqFjI/kog2VhZlE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=BgLz+Xe/VVEZA4bOcbzxgTA9j/Np3x0XAZd8+M4d2RxH1xx0iKpf225dt4tg5zQfRQPXC13I9sWaUxMkV63KedSg0uoRjb3JpF1+8ZU9POlv13E/P/PwfWKW1p8NBpF1qdix11gacMC/KLL/2wj8nPmIjwRfOEqkAItJLKqJ6zA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WLDkMjKl; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WLDkMjKl" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0115EC116B1; Wed, 5 Nov 2025 21:07:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376863; bh=6jwYeig2zqPrmwGwM4pQrsHIUOv6qqFjI/kog2VhZlE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WLDkMjKlR2GjM6ul10cb70ghHeSr5aremTf7CjKkC/5QKjY04kalvpwuwnKK4I5NY IEO/g7jz17F5MZF1gqOpJNb6UxGmp6JHci/Qa5aPe7GOuNcZsnGK1YtNlkgGRrC+ka DxmVtLqKmpAvFgJvU4RPxh/CZbJHpONR7S74p0VLnz8YRnymh5W+/aGYeh++jcRsgR Js/L3sSGqqg+cb6eDfgLzf69RQ8Pqw98poq7YGmCkDDJT8etX5MyL8Z/gL95TOFsqr BJ5Uly3TzQphQ9EKLsa+FCZImOjNeyUAZRabscSXCaRpMuJc99OamaZpN1Fx7tIPjT 6iPf8Cpfdggkw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 28/31] kthread: Honour kthreads preferred affinity after cpuset changes
Date: Wed, 5 Nov 2025 22:03:44 +0100
Message-ID: <20251105210348.35256-29-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

When cpuset isolated partitions get updated, unbound kthreads are
indiscriminately made affine to all non-isolated CPUs, regardless of
their individual affinity preferences.

For example, kswapd is a per-node kthread that prefers to be affine to
the node it refers to. Whenever an isolated partition is created,
updated or deleted, kswapd's node affinity is broken if any CPU in the
related node is not isolated, because kswapd becomes affine globally.

Fix this by letting the consolidated kthread managed affinity code do
the affinity update on behalf of cpuset.

Signed-off-by: Frederic Weisbecker --- include/linux/kthread.h | 1 + kernel/cgroup/cpuset.c | 5 ++--- kernel/kthread.c | 41 ++++++++++++++++++++++++++++++---------- kernel/sched/isolation.c | 2 ++ 4 files changed, 36 insertions(+), 13 deletions(-) diff --git a/include/linux/kthread.h b/include/linux/kthread.h index 8d27403888ce..c92c1149ee6e 100644 --- a/include/linux/kthread.h +++ b/include/linux/kthread.h @@ -100,6 +100,7 @@ void kthread_unpark(struct task_struct *k); void kthread_parkme(void); void kthread_exit(long result) __noreturn; void kthread_complete_and_exit(struct completion *, long) __noreturn; +int kthreads_update_housekeeping(void); =20 int kthreadd(void *unused); extern struct task_struct *kthreadd_task; diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 817c07a7a1b4..bc3f18ead7c8 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1182,11 +1182,10 @@ void cpuset_update_tasks_cpumask(struct cpuset *cs,= struct cpumask *new_cpus) =20 if (top_cs) { /* + * PF_KTHREAD tasks are handled by housekeeping. * PF_NO_SETAFFINITY tasks are ignored. - * All per cpu kthreads should have PF_NO_SETAFFINITY - * flag set, see kthread_set_per_cpu(). */ - if (task->flags & PF_NO_SETAFFINITY) + if (task->flags & (PF_KTHREAD | PF_NO_SETAFFINITY)) continue; cpumask_andnot(new_cpus, possible_mask, subpartitions_cpus); } else { diff --git a/kernel/kthread.c b/kernel/kthread.c index 69d70baceba2..f535d4e66a71 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -896,14 +896,7 @@ int kthread_affine_preferred(struct task_struct *p, co= nst struct cpumask *mask) } EXPORT_SYMBOL_GPL(kthread_affine_preferred); =20 -/* - * Re-affine kthreads according to their preferences - * and the newly online CPU. The CPU down part is handled - * by select_fallback_rq() which default re-affines to - * housekeepers from other nodes in case the preferred - * affinity doesn't apply anymore. 
- */ -static int kthreads_online_cpu(unsigned int cpu) +static int kthreads_update_affinity(bool force) { cpumask_var_t affinity; struct kthread *k; @@ -929,7 +922,8 @@ static int kthreads_online_cpu(unsigned int cpu) /* * Unbound kthreads without preferred affinity are already affine * to housekeeping, whether those CPUs are online or not. So no need - * to handle newly online CPUs for them. + * to handle newly online CPUs for them. However housekeeping changes + * have to be applied. * * But kthreads with a preferred affinity or node are different: * if none of their preferred CPUs are online and part of @@ -937,7 +931,7 @@ static int kthreads_online_cpu(unsigned int cpu) * But as soon as one of their preferred CPU becomes online, they must * be affine to them. */ - if (k->preferred_affinity || k->node !=3D NUMA_NO_NODE) { + if (force || k->preferred_affinity || k->node !=3D NUMA_NO_NODE) { kthread_fetch_affinity(k, affinity); set_cpus_allowed_ptr(k->task, affinity); } @@ -948,6 +942,33 @@ static int kthreads_online_cpu(unsigned int cpu) return ret; } =20 +/** + * kthreads_update_housekeeping - Update kthreads affinity on cpuset change + * + * When cpuset changes a partition type to/from "isolated" or updates rela= ted + * cpumasks, propagate the housekeeping cpumask change to preferred kthrea= ds + * affinity. + * + * Returns 0 if successful, -ENOMEM if temporary mask couldn't + * be allocated or -EINVAL in case of internal error. + */ +int kthreads_update_housekeeping(void) +{ + return kthreads_update_affinity(true); +} + +/* + * Re-affine kthreads according to their preferences + * and the newly online CPU. The CPU down part is handled + * by select_fallback_rq() which default re-affines to + * housekeepers from other nodes in case the preferred + * affinity doesn't apply anymore. 
+ */ +static int kthreads_online_cpu(unsigned int cpu) +{ + return kthreads_update_affinity(false); +} + static int kthreads_init(void) { return cpuhp_setup_state(CPUHP_AP_KTHREADS_ONLINE, "kthreads:online", diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index bad5fdf7e991..bc77c87e93ac 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -151,6 +151,8 @@ int housekeeping_update(struct cpumask *mask, enum hk_t= ype type) mem_cgroup_flush_workqueue(); vmstat_flush_workqueue(); err =3D workqueue_unbound_housekeeping_update(housekeeping_cpumask(type)); + WARN_ON_ONCE(err < 0); + err =3D kthreads_update_housekeeping(); =20 kfree(old); =20 --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8507534CFA0; Wed, 5 Nov 2025 21:07:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376871; cv=none; b=ZU7uaop61ybNN8+98BJjkdHCl4eUBeah2DGFUHhyMwHF4kqb8pniMPcFiAR7sFVLrwA2CogJHQ1/dvJ67zUVIDQqieL/H/6IRqV5BVFaz8mrZDNFYK2B29H99mJsvt1Mhgo7eHkptsnicdFpPXf1e2eo1KCwmque4/hGRA9j490= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376871; c=relaxed/simple; bh=ZcOaIdyFvlxFsz4x9bV8dX/urbiaQ2dIYf+IOrGWyKQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=VqBjbZmJshrwiMfQX17c5j9yxvAZIWUTMbna1xJ+Nxj34zc3m/83NfisG6sud5dW9WAiGAA6yJjtcmskTdggs/Zwv7av4I2fApi3uLdEpVov8TQ2ugUU7A78xdPdnNUCD18Xn0Nxl1ovwF2t+ad84aUaagCsnqnkhItI08sc97Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=uqBh1peA; arc=none smtp.client-ip=10.30.226.201 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="uqBh1peA" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BEAB4C4CEF5; Wed, 5 Nov 2025 21:07:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376871; bh=ZcOaIdyFvlxFsz4x9bV8dX/urbiaQ2dIYf+IOrGWyKQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uqBh1peA3OwqS5304aC8e215GChgjNQ+xz8WZIQjvhmtUS6f7nqh/RdQVeNQ2M1h0 N9H3k5WiLHpEb6YnuD/MWiQNlkqvtuYe7W0HDhMHx5X8eNFG6DBGHNKHlJoU2BaZ3M rY4SVKr9xZY+z9hsVZG9KG/MPb2rofpyd5Vsybfo7AU5hqZEdZTrlKKYfGBaIM7drb 2RHucdD91Kljr+Sb2WKdMpyTxwAtuFcHF2KllXdzXz5owrbsFGBYFYh+LObYkRbxKp 6VQgFGT1HY05dsL5fIdD8y7aTR9kXshC2UmQ59+mAygTcfny0kDhWu6JYvJ0+oQUt1 C3W1zYUxNI9Ng== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 29/31] kthread: Comment on the purpose and placement of kthread_affine_node() call
Date: Wed, 5 Nov 2025 22:03:45 +0100
Message-ID: <20251105210348.35256-30-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

It may not be obvious why kthread_affine_node() is called after the
first wake-up rather than before the kthread creation completes.

The reason is that kthread_affine_node() applies a default affinity
behaviour that only takes place if no affinity preference has already
been passed by the kthread creation call site.

Add a comment to clarify that.

Reported-by: Peter Zijlstra
Signed-off-by: Frederic Weisbecker
---
 kernel/kthread.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index f535d4e66a71..e38ff9b943cf 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -453,6 +453,10 @@ static int kthread(void *_create)
 
 	self->started = 1;
 
+	/*
+	 * Apply default node affinity if no call to kthread_bind[_mask]() nor
+	 * kthread_affine_preferred() was issued before the first wake-up.
+ */ if (!(current->flags & PF_NO_SETAFFINITY) && !self->preferred_affinity) kthread_affine_node(); =20 --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 01A4C34CFA0; Wed, 5 Nov 2025 21:07:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376879; cv=none; b=MVPCGXFVE0zq10vZpxWs1yT12OtQABD4iXBaIgkypASFsessQl/rTosQRvk9MRkVE8Ot1y1LouTmXXMY6nFP7nt1X/ovghZ+8GKf8/0Yh7DgtTIP80rFyJOdjPDN9ypfaF5fPXGsdiEbnqBhQIrUPwzeqYmVgQUkUJeHzYNZJmI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376879; c=relaxed/simple; bh=VBowzwkKI7sQO6hRH/44BMa4PxA99ppD75DvlCpqKuI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DeJD9r66bVYJ4Uv3X/t4ma4JQ36NghiLwHgzBBZ4r+WdnYYYBKySISZuPQ63ZglPg24gCsRRTUNfZmXiOsIRyXgxSeItDToVCodNvEz2AWhYIL+QLaYdXgQimKAc5CJ5sVf43HppNv4Od1XDx4BppU9DHvKpuSH1dTiyKSfNK50= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=HCxWOTZX; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="HCxWOTZX" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9B1E3C116B1; Wed, 5 Nov 2025 21:07:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376878; bh=VBowzwkKI7sQO6hRH/44BMa4PxA99ppD75DvlCpqKuI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HCxWOTZXSfmgLOy+8qHyUSsJQbRRrEggtHXYLIpXy1swuMrv2SEYUM8cLQW7kSc4H 0Ka2bto9uN0BXbhyNM3r1gLJVYwW3t2rKAXPeI6d3ZnnTZoRFUTmkBveHxO+F8uRmo 
AXkHlKN4LkLllSCXYkJiBOKHWrRf6CXXYtYu1l6M03U6EEvLzqrMyOOfnSaNsd0vRj h2TmHpzXfc20tUeNa2fzrHLlFE/ggaDYNk1I4h5CwebPAdOaP0ppHyvo465ef4OXCB CyfaCDTcbrZRKcEQW+8KvinCbmGda/odIiBswEOY/el7rbNcbJ1X/NP4oGu/CK60DH oTdnWeUTkDOGg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH 30/31] kthread: Document kthread_affine_preferred() Date: Wed, 5 Nov 2025 22:03:46 +0100 Message-ID: <20251105210348.35256-31-frederic@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251105210348.35256-1-frederic@kernel.org> References: <20251105210348.35256-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The documentation of this new API has been overlooked during its introduction. Fill the gap. Signed-off-by: Frederic Weisbecker --- kernel/kthread.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/kernel/kthread.c b/kernel/kthread.c index e38ff9b943cf..4d0d5dccc9d8 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -858,6 +858,18 @@ int kthreadd(void *unused) return 0; } =20 +/** + * kthread_affine_preferred - Define a kthread's preferred affinity + * @p: thread created by kthread_create(). 
+ * @mask: preferred mask of CPUs (might not be online, must be possible) f= or @p + * to run on. + * + * Similar to kthread_bind_mask() except that the affinity is not a requir= ement + * but rather a preference that can be constrained by CPU isolation or CPU= hotplug. + * Must be called before the first wakeup of the kthread. + * + * Returns 0 if the affinity has been applied. + */ int kthread_affine_preferred(struct task_struct *p, const struct cpumask *= mask) { struct kthread *kthread =3D to_kthread(p); --=20 2.51.0 From nobody Sun Dec 14 22:11:54 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 896AB34CFA0; Wed, 5 Nov 2025 21:08:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376887; cv=none; b=dnpYLT+Y0DMo8cBOe8MnGNqxNRuz+2gUP94wnpOmpl9Z4ijL+hwKG5pI86hZKToX0G8YEDK61+5BKiLBAcgCDRWK8xU1XtkLKU6mvl8wW3UUQPivlveN2sYBb9eVajreL/gjlZIoA2hFTcrMX09U+Mb4po/YwniXPCXT/0IKU9A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762376887; c=relaxed/simple; bh=Jrc2SP9zuO/q8lVKXeMSZQ2TjjYp5fXU9+UkI2oxxh8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OL18o24modziuZcfHgobmkthlJxB+hoyjQjMk9E0HWgAm3RXNBJXrY19CcvTLmpZ3ODUYEi8wpV9AhIDXUF6x1LlLrWibiinUbArR8vDBHaICc4EMWmSUUw6DjliZ93MA3yN4kUQC73O+jT0JX4E2Y66CGYmkxCdXPPVWNjQW6Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MvMmy9z/; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MvMmy9z/" Received: by smtp.kernel.org (Postfix) with 
ESMTPSA id 4D4C2C116C6; Wed, 5 Nov 2025 21:07:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1762376887; bh=Jrc2SP9zuO/q8lVKXeMSZQ2TjjYp5fXU9+UkI2oxxh8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MvMmy9z/Tgk53O6EbE+zG2oKCMSZwHnfkHuPoSeCh479KVbDuJNzsb/Q95ddbdX1O YQTf2wfL/dngH/FTqEV6nyuyONeRGG6HmXIoUpQYnvRwoBce1dGez5Rnclx/R54AJG jF46uVQGSBmc+QMKekLAz3yISaEfO/lEUBh/VXH4j9x5/+pRiLN7gYF1+fWkOKQ36F j1wBHxPHnmeF1FuhdLWetNBKgWb0hoEHcy6zOXnoMX7DeUEMTwVgqvZecZg7n+SX65 jdNVvmiqi+lM6qGacp0TkqxYpP3U9TZXPPfxP2T1eiIo+XV5uRXCcGxIxzrr4Zr+Kh OUYcrvQZvJ2Zg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Andrew Morton , Bjorn Helgaas , Catalin Marinas , Danilo Krummrich , "David S . Miller" , Eric Dumazet , Gabriele Monaco , Greg Kroah-Hartman , Ingo Molnar , Jakub Kicinski , Jens Axboe , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Muchun Song , Paolo Abeni , Peter Zijlstra , Phil Auld , "Rafael J . 
Wysocki" , Roman Gushchin , Shakeel Butt , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , Will Deacon , cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-block@vger.kernel.org, linux-mm@kvack.org, linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 31/31] doc: Add housekeeping documentation
Date: Wed, 5 Nov 2025 22:03:47 +0100
Message-ID: <20251105210348.35256-32-frederic@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
References: <20251105210348.35256-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Frederic Weisbecker
---
 Documentation/cpu_isolation/housekeeping.rst | 111 +++++++++++++++++++
 1 file changed, 111 insertions(+)
 create mode 100644 Documentation/cpu_isolation/housekeeping.rst

diff --git a/Documentation/cpu_isolation/housekeeping.rst b/Documentation/cpu_isolation/housekeeping.rst
new file mode 100644
index 000000000000..e5417302774c
--- /dev/null
+++ b/Documentation/cpu_isolation/housekeeping.rst
@@ -0,0 +1,111 @@
+======================================
+Housekeeping
+======================================
+
+
+CPU Isolation moves away kernel work that may otherwise run on any CPU.
+The purpose of its related features is to reduce the OS jitter that some
+extreme workloads cannot tolerate, such as in some DPDK use cases.
+
+The kernel work moved away by CPU isolation is commonly described as
+"housekeeping" because it includes ground work that performs cleanups,
+statistics maintenance and actions relying on them, memory release,
+various deferrals etc...
+
+Sometimes housekeeping is just some unbound work (unbound workqueues,
+unbound timers, ...) that gets easily assigned to non-isolated CPUs.
+But sometimes housekeeping is tied to a specific CPU and requires
+elaborate tricks to be offloaded to non-isolated CPUs (RCU_NOCB, remote
+scheduler tick, etc...).
+
+Thus, a housekeeping CPU can be considered the opposite of an isolated
+CPU. It is simply a CPU that can execute housekeeping work. There must
+be at least one online housekeeping CPU at any time. The CPUs that
+are not isolated are automatically assigned as housekeeping.
+
+Housekeeping is currently divided into four features described
+by ``enum hk_type``:
+
+1. HK_TYPE_DOMAIN matches the work moved away by scheduler domain
+   isolation performed through the ``isolcpus=domain`` boot parameter or
+   isolated cpuset partitions in cgroup v2. This includes scheduler
+   load balancing, unbound workqueues and timers.
+
+2. HK_TYPE_KERNEL_NOISE matches the work moved away by tick isolation
+   performed through the ``nohz_full=`` or ``isolcpus=nohz`` boot
+   parameters. This includes remote scheduler tick, vmstat and lockup
+   watchdog.
+
+3. HK_TYPE_MANAGED_IRQ matches the IRQ handlers moved away by managed
+   IRQ isolation performed through ``isolcpus=managed_irq``.
+
+4. HK_TYPE_DOMAIN_BOOT matches the work moved away by scheduler domain
+   isolation performed through ``isolcpus=domain`` only. It is similar
+   to HK_TYPE_DOMAIN except it ignores the isolation performed by
+   cpusets.
+
+
+Housekeeping cpumasks
+=================================
+
+Housekeeping cpumasks include the CPUs that can execute the work moved
+away by the matching isolation feature. These cpumasks are returned by
+the following function::
+
+    const struct cpumask *housekeeping_cpumask(enum hk_type type)
+
+By default, if neither ``nohz_full=``, ``isolcpus=`` nor cpuset
+isolated partitions are used, which covers most use cases, this function
+returns cpu_possible_mask.
+
+Otherwise the function returns the cpumask complement of the isolation
+feature. For example:
+
+With ``isolcpus=domain,7`` the following will return a mask with all possible
+CPUs except 7::
+
+    housekeeping_cpumask(HK_TYPE_DOMAIN)
+
+Similarly with ``nohz_full=5,6`` the following will return a mask with all
+possible CPUs except 5 and 6::
+
+    housekeeping_cpumask(HK_TYPE_KERNEL_NOISE)
+
+
+Synchronization against cpusets
+=================================
+
+Cpuset can modify the HK_TYPE_DOMAIN housekeeping cpumask while creating,
+modifying or deleting an isolated partition.
+
+Users of the HK_TYPE_DOMAIN cpumask must then synchronize properly
+against cpuset in order to guarantee that:
+
+1. The cpumask snapshot stays coherent.
+
+2. No housekeeping work is queued on a newly isolated CPU.
+
+3. Pending housekeeping work that was queued to a non-isolated
+   CPU which just turned isolated through cpuset must be flushed
+   before the related created/modified isolated partition is made
+   available to userspace.
+
+This synchronization is maintained by an RCU-based scheme. The cpuset update
+side waits for an RCU grace period after updating the HK_TYPE_DOMAIN
+cpumask and before flushing pending work. On the read side, care must be
+taken to perform both the housekeeping target election and the work enqueue
+within the same RCU read-side critical section.
+
+A typical layout example would look like this on the update side
+(``housekeeping_update()``)::
+
+    rcu_assign_pointer(housekeeping_cpumasks[type], trial);
+    synchronize_rcu();
+    flush_workqueue(example_workqueue);
+
+And then on the read side::
+
+    rcu_read_lock();
+    cpu = housekeeping_any_cpu(HK_TYPE_DOMAIN);
+    queue_work_on(cpu, example_workqueue, work);
+    rcu_read_unlock();
-- 
2.51.0
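
As an illustration of the "Housekeeping cpumasks" section in patch 31/31, here is a minimal user-space sketch of the complement relationship between isolated CPUs and the housekeeping mask. It is not kernel code and not part of the series: hk_mask_t, hk_cpu_bit(), hk_complement() and hk_first_cpu() are hypothetical names made up for this example, and a 64-bit integer stands in for struct cpumask under the assumption of at most 64 CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t hk_mask_t;

/* Build a mask that contains only @cpu. */
static inline hk_mask_t hk_cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (hk_mask_t)1 << cpu;
}

/*
 * Housekeeping mask as the complement of the isolated set within the
 * possible CPUs: every possible CPU that is not isolated.
 */
static inline hk_mask_t hk_complement(hk_mask_t possible, hk_mask_t isolated)
{
	return possible & ~isolated;
}

/* Return the lowest-numbered CPU set in @mask, or -1 if the mask is empty. */
static inline int hk_first_cpu(hk_mask_t mask)
{
	for (int cpu = 0; cpu < 64; cpu++) {
		if (mask & ((hk_mask_t)1 << cpu))
			return cpu;
	}
	return -1;
}
```

With eight possible CPUs (mask 0xFF), hk_complement(0xFF, hk_cpu_bit(7)) yields 0x7F, which mirrors the isolcpus=domain,7 example in the documentation. Isolating CPUs 5 and 6 yields 0x9F, which mirrors the nohz_full=5,6 example. The sketch deliberately leaves out the RCU synchronization that the real housekeeping cpumask requires once cpuset can modify it at runtime.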