From nobody Sun Feb 8 13:08:51 2026
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Marcelo Tosatti,
 Michal Hocko, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner,
 Valentin Schneider, Vlastimil Babka, linux-mm@kvack.org
Subject: [PATCH 1/6] task_work: Provide means to check if a work is queued
Date: Thu, 10 Apr 2025 17:23:22 +0200
Message-ID: <20250410152327.24504-2-frederic@kernel.org>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250410152327.24504-1-frederic@kernel.org>
References: <20250410152327.24504-1-frederic@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Some task work users implement their own ways to know if a callback is
already queued on the current task while fiddling with the callback
head internals.

Provide instead a consolidated API to serve this very purpose.

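For readers who want to see the idea in isolation, here is a minimal
userspace sketch of a sentinel-based "is this work queued?" check. It is
an illustration under stated assumptions, not the kernel implementation:
the struct is a simplified stand-in for struct callback_head, the list is
single-threaded (no cmpxchg, no locking), and the names pending,
init_work(), work_queued(), work_add(), work_run() and sketch_demo() are
hypothetical. Only the TASK_WORK_DEQUEUED sentinel value is taken from
this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct callback_head. */
struct callback_head {
	struct callback_head *next;
	void (*func)(struct callback_head *);
};

/* Sentinel kept in ->next while the work is on no list. */
#define TASK_WORK_DEQUEUED ((struct callback_head *)-1UL)

/* Hypothetical single-threaded list head; the kernel uses task->task_works. */
static struct callback_head *pending;
static int calls;

static void init_work(struct callback_head *w,
		      void (*func)(struct callback_head *))
{
	w->func = func;
	w->next = TASK_WORK_DEQUEUED;
}

static bool work_queued(const struct callback_head *w)
{
	/* NULL is a valid ->next (tail of the list), so NULL means queued. */
	return w->next != TASK_WORK_DEQUEUED;
}

/* Returns false when the work was already queued (no double add). */
static bool work_add(struct callback_head *w)
{
	if (work_queued(w))
		return false;
	w->next = pending;
	pending = w;
	return true;
}

static void work_run(void)
{
	struct callback_head *w = pending;

	pending = NULL;
	while (w) {
		struct callback_head *next = w->next;

		/* Mark dequeued before the callback so it may requeue itself. */
		w->next = TASK_WORK_DEQUEUED;
		w->func(w);
		w = next;
	}
}

static void count_call(struct callback_head *w)
{
	(void)w;
	calls++;
}

/* Returns the number of callback invocations, or -1 on a broken invariant. */
int sketch_demo(void)
{
	struct callback_head w;
	bool first, second;

	pending = NULL;
	calls = 0;
	init_work(&w, count_call);
	assert(!work_queued(&w));

	first = work_add(&w);	/* queued for the first time */
	second = work_add(&w);	/* rejected: already queued */
	work_run();

	if (!first || second || work_queued(&w))
		return -1;
	return calls;
}
```

Calling sketch_demo() from a small main() is enough to try it out. The
design point the sketch tries to show is that a dedicated sentinel, unlike
a self-pointing ->next, needs no per-user setup beyond the init helper.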
Reviewed-by: Oleg Nesterov
Reviewed-by: Valentin Schneider
Signed-off-by: Frederic Weisbecker
---
 include/linux/task_work.h | 12 ++++++++++++
 kernel/task_work.c        |  9 +++++++--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 0646804860ff..31caf12c1313 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -5,12 +5,15 @@
 #include
 #include
 
+#define TASK_WORK_DEQUEUED ((void *) -1UL)
+
 typedef void (*task_work_func_t)(struct callback_head *);
 
 static inline void
 init_task_work(struct callback_head *twork, task_work_func_t func)
 {
 	twork->func = func;
+	twork->next = TASK_WORK_DEQUEUED;
 }
 
 enum task_work_notify_mode {
@@ -26,6 +29,15 @@ static inline bool task_work_pending(struct task_struct *task)
 	return READ_ONCE(task->task_works);
 }
 
+/*
+ * Check if a work is queued. Beware: this is inherently racy if the work can
+ * be queued elsewhere than the current task.
+ */
+static inline bool task_work_queued(struct callback_head *twork)
+{
+	return twork->next != TASK_WORK_DEQUEUED;
+}
+
 int task_work_add(struct task_struct *task, struct callback_head *twork,
 		enum task_work_notify_mode mode);
 
diff --git a/kernel/task_work.c b/kernel/task_work.c
index d1efec571a4a..56718cb824d9 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -67,8 +67,10 @@ int task_work_add(struct task_struct *task, struct callback_head *work,
 
 	head = READ_ONCE(task->task_works);
 	do {
-		if (unlikely(head == &work_exited))
+		if (unlikely(head == &work_exited)) {
+			work->next = TASK_WORK_DEQUEUED;
 			return -ESRCH;
+		}
 		work->next = head;
 	} while (!try_cmpxchg(&task->task_works, &head, work));
 
@@ -129,8 +131,10 @@ task_work_cancel_match(struct task_struct *task,
 		if (!match(work, data)) {
 			pprev = &work->next;
 			work = READ_ONCE(*pprev);
-		} else if (try_cmpxchg(pprev, &work, work->next))
+		} else if (try_cmpxchg(pprev, &work, work->next)) {
+			work->next = TASK_WORK_DEQUEUED;
 			break;
+		}
 	}
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 
@@ -224,6 +228,7 @@ void task_work_run(void)
 
 		do {
 			next = work->next;
+			work->next = TASK_WORK_DEQUEUED;
 			work->func(work);
 			work = next;
 			cond_resched();
-- 
2.48.1

From nobody Sun Feb 8 13:08:51 2026
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Marcelo Tosatti,
 Michal Hocko, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner,
 Valentin Schneider, Vlastimil Babka, linux-mm@kvack.org
Subject: [PATCH 2/6] sched/fair: Use task_work_queued() on numa_work
Date: Thu, 10 Apr 2025 17:23:23 +0200
Message-ID: <20250410152327.24504-3-frederic@kernel.org>
In-Reply-To: <20250410152327.24504-1-frederic@kernel.org>
References: <20250410152327.24504-1-frederic@kernel.org>

Remove the ad-hoc implementation of task_work_queued().

Reviewed-by: Oleg Nesterov
Reviewed-by: Valentin Schneider
Signed-off-by: Frederic Weisbecker
---
 kernel/sched/fair.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e43993a4e580..c6ffa2fdbbd6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3317,7 +3317,6 @@ static void task_numa_work(struct callback_head *work)
 
 	WARN_ON_ONCE(p != container_of(work, struct task_struct, numa_work));
 
-	work->next = work;
 	/*
 	 * Who cares about NUMA placement when they're dying.
 	 *
@@ -3565,8 +3564,6 @@ void init_numa_balancing(unsigned long clone_flags, struct task_struct *p)
 	p->numa_scan_seq = mm ? mm->numa_scan_seq : 0;
 	p->numa_scan_period = sysctl_numa_balancing_scan_delay;
 	p->numa_migrate_retry = 0;
-	/* Protect against double add, see task_tick_numa and task_numa_work */
-	p->numa_work.next = &p->numa_work;
 	p->numa_faults = NULL;
 	p->numa_pages_migrated = 0;
 	p->total_numa_faults = 0;
@@ -3607,7 +3604,7 @@ static void task_tick_numa(struct rq *rq, struct task_struct *curr)
 	/*
 	 * We don't care about NUMA placement if we don't have memory.
 	 */
-	if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) || work->next != work)
+	if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) || task_work_queued(work))
 		return;
 
 	/*
-- 
2.48.1

From nobody Sun Feb 8 13:08:51 2026
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Marcelo Tosatti,
 Michal Hocko, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner,
 Valentin Schneider, Vlastimil Babka, linux-mm@kvack.org
Subject: [PATCH 3/6] sched: Use task_work_queued() on cid_work
Date: Thu, 10 Apr 2025 17:23:24 +0200
Message-ID: <20250410152327.24504-4-frederic@kernel.org>
In-Reply-To: <20250410152327.24504-1-frederic@kernel.org>
References: <20250410152327.24504-1-frederic@kernel.org>

Remove the ad-hoc implementation of task_work_queued().

Reviewed-by: Oleg Nesterov
Signed-off-by: Frederic Weisbecker
---
 kernel/sched/core.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cfaca3040b2f..add41254b6e5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10576,7 +10576,6 @@ static void task_mm_cid_work(struct callback_head *work)
 
 	WARN_ON_ONCE(t != container_of(work, struct task_struct, cid_work));
 
-	work->next = work; /* Prevent double-add */
 	if (t->flags & PF_EXITING)
 		return;
 	mm = t->mm;
@@ -10620,7 +10619,6 @@ void init_sched_mm_cid(struct task_struct *t)
 		if (mm_users == 1)
 			mm->mm_cid_next_scan = jiffies + msecs_to_jiffies(MM_CID_SCAN_DELAY);
 	}
-	t->cid_work.next = &t->cid_work; /* Protect against double add */
 	init_task_work(&t->cid_work, task_mm_cid_work);
 }
 
@@ -10629,8 +10627,7 @@ void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
 	struct callback_head *work = &curr->cid_work;
 	unsigned long now = jiffies;
 
-	if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) ||
-	    work->next != work)
+	if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) || task_work_queued(work))
 		return;
 	if (time_before(now, READ_ONCE(curr->mm->mm_cid_next_scan)))
 		return;
-- 
2.48.1

From nobody Sun Feb 8 13:08:51 2026
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Marcelo Tosatti,
 Michal Hocko, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner,
 Valentin Schneider, Vlastimil Babka, linux-mm@kvack.org
Subject: [PATCH 4/6] tick/nohz: Move nohz_full related fields out of hot task struct's places
Date: Thu, 10 Apr 2025 17:23:25 +0200
Message-ID: <20250410152327.24504-5-frederic@kernel.org>
In-Reply-To: <20250410152327.24504-1-frederic@kernel.org>
References: <20250410152327.24504-1-frederic@kernel.org>

nohz_full is a feature that only fits rare corner cases. Yet distros
enable it by default and therefore the related fields are always
reserved in the task struct.
Those task fields are stored in the middle of cacheline hot places such
as cputime accounting and context switch counting, which doesn't make
any sense for a feature that is disabled most of the time.

Move the nohz_full storage to colder places.

Signed-off-by: Frederic Weisbecker
---
 include/linux/sched.h | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f96ac1982893..b5ce76db6d75 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1110,13 +1110,7 @@ struct task_struct {
 #endif
 	u64 gtime;
 	struct prev_cputime prev_cputime;
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-	struct vtime vtime;
-#endif
 
-#ifdef CONFIG_NO_HZ_FULL
-	atomic_t tick_dep_mask;
-#endif
 	/* Context switch counts: */
 	unsigned long nvcsw;
 	unsigned long nivcsw;
@@ -1438,6 +1432,14 @@ struct task_struct {
 	struct task_delay_info *delays;
 #endif
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+	struct vtime vtime;
+#endif
+
+#ifdef CONFIG_NO_HZ_FULL
+	atomic_t tick_dep_mask;
+#endif
+
 #ifdef CONFIG_FAULT_INJECTION
 	int make_it_fail;
 	unsigned int fail_nth;
-- 
2.48.1

From nobody Sun Feb 8 13:08:51 2026
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Marcelo Tosatti,
 Michal Hocko, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner,
 Valentin Schneider, Vlastimil Babka, linux-mm@kvack.org
Subject: [PATCH 5/6] sched/isolation: Introduce isolated task work
Date: Thu, 10 Apr 2025 17:23:26 +0200
Message-ID: <20250410152327.24504-6-frederic@kernel.org>
In-Reply-To: <20250410152327.24504-1-frederic@kernel.org>
References: <20250410152327.24504-1-frederic@kernel.org>

Some asynchronous kernel work may be pending upon resume to userspace
and execute later on. On isolated workloads this becomes problematic
once the process is done with preparatory work involving syscalls and
wants to run in userspace without being interrupted.

Provide an infrastructure to queue a work to be executed from the
current isolated task context right before resuming to userspace. This
goes with the assumption that isolated tasks are pinned to a single
nohz_full CPU.

Signed-off-by: Frederic Weisbecker
---
 include/linux/sched.h           |  1 +
 include/linux/sched/isolation.h | 17 +++++++++++++++++
 kernel/sched/core.c             |  1 +
 kernel/sched/isolation.c        | 31 +++++++++++++++++++++++++++++++
 kernel/sched/sched.h            |  1 +
 5 files changed, 51 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index b5ce76db6d75..4d764eb96e3e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1437,6 +1437,7 @@ struct task_struct {
 #endif
 
 #ifdef CONFIG_NO_HZ_FULL
+	struct callback_head nohz_full_work;
 	atomic_t tick_dep_mask;
 #endif
 
diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index d8501f4709b5..74da4324b984 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -77,4 +77,21 @@ static inline bool cpu_is_isolated(int cpu)
 	       cpuset_cpu_is_isolated(cpu);
 }
 
+#if defined(CONFIG_NO_HZ_FULL)
+extern int __isolated_task_work_queue(void);
+
+static inline int isolated_task_work_queue(void)
+{
+	if (!housekeeping_cpu(raw_smp_processor_id(), HK_TYPE_KERNEL_NOISE))
+		return -ENOTSUPP;
+
+	return __isolated_task_work_queue();
+}
+
+extern void isolated_task_work_init(struct task_struct *tsk);
+#else
+static inline int isolated_task_work_queue(void) { return -ENOTSUPP; }
+static inline void isolated_task_work_init(struct task_struct *tsk) { }
+#endif /* CONFIG_NO_HZ_FULL */
+
 #endif /* _LINUX_SCHED_ISOLATION_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index add41254b6e5..c8b8b61ac3a6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4524,6 +4524,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 	p->migration_pending = NULL;
 #endif
 	init_sched_mm_cid(p);
+	isolated_task_work_init(p);
 }
 
 DEFINE_STATIC_KEY_FALSE(sched_numa_balancing);
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 81bc8b329ef1..e246287de9fa 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -249,3 +249,34 @@ static int __init housekeeping_isolcpus_setup(char *str)
 	return housekeeping_setup(str, flags);
 }
 __setup("isolcpus=", housekeeping_isolcpus_setup);
+
+#if defined(CONFIG_NO_HZ_FULL)
+static void isolated_task_work(struct callback_head *head)
+{
+}
+
+int __isolated_task_work_queue(void)
+{
+	unsigned long flags;
+	int ret;
+
+	if (current->flags & PF_KTHREAD)
+		return -EINVAL;
+
+	local_irq_save(flags);
+	if (task_work_queued(&current->nohz_full_work)) {
+		ret = 0;
+		goto out;
+	}
+
+	ret = task_work_add(current, &current->nohz_full_work, TWA_RESUME);
+out:
+	local_irq_restore(flags);
+	return ret;
+}
+
+void isolated_task_work_init(struct task_struct *tsk)
+{
+	init_task_work(&tsk->nohz_full_work, isolated_task_work);
+}
+#endif /* CONFIG_NO_HZ_FULL */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 47972f34ea70..e7dc4ae5ccc1 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -60,6 +60,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
-- 
2.48.1

From nobody Sun Feb 8 13:08:51 2026
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Marcelo Tosatti,
 Michal Hocko, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner,
 Valentin Schneider, Vlastimil Babka, linux-mm@kvack.org
Subject: [PATCH 6/6] mm: Drain LRUs upon resume to userspace on nohz_full CPUs
Date: Thu, 10 Apr 2025 17:23:27 +0200
Message-ID: <20250410152327.24504-7-frederic@kernel.org>
In-Reply-To: <20250410152327.24504-1-frederic@kernel.org>
References: <20250410152327.24504-1-frederic@kernel.org>

LRU batching can be a source of disturbances for isolated workloads
running in userspace because it requires a kernel worker to handle the
batches, and that worker would preempt the said task. The primary source
of such disruption would be __lru_add_drain_all(), which could be
triggered from non-isolated CPUs.

Why would an isolated CPU have anything on the pcp cache? Many syscalls
allocate pages that might end up there. A typical and unavoidable one
would be fork/exec leaving pages behind on the cache, just waiting for
somebody to drain them.

Address the problem by noting that a batch has been added to the cache
and scheduling the draining upon return to userspace, so the work is
done while the syscall is still executing and there are no surprises
while the task runs in userspace, where it doesn't want to be preempted.

Signed-off-by: Frederic Weisbecker
---
 include/linux/pagevec.h  | 18 ++----------------
 include/linux/swap.h     |  1 +
 kernel/sched/isolation.c |  3 +++
 mm/swap.c                | 30 +++++++++++++++++++++++++++++-
 4 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/include/linux/pagevec.h b/include/linux/pagevec.h
index 5d3a0cccc6bf..7e647b8df4c7 100644
--- a/include/linux/pagevec.h
+++ b/include/linux/pagevec.h
@@ -61,22 +61,8 @@ static inline unsigned int folio_batch_space(struct folio_batch *fbatch)
 	return PAGEVEC_SIZE - fbatch->nr;
 }
 
-/**
- * folio_batch_add() - Add a folio to a batch.
- * @fbatch: The folio batch.
- * @folio: The folio to add.
- *
- * The folio is added to the end of the batch.
- * The batch must have previously been initialised using folio_batch_init().
- *
- * Return: The number of slots still available.
- */
-static inline unsigned folio_batch_add(struct folio_batch *fbatch,
-		struct folio *folio)
-{
-	fbatch->folios[fbatch->nr++] = folio;
-	return folio_batch_space(fbatch);
-}
+unsigned int folio_batch_add(struct folio_batch *fbatch,
+		struct folio *folio);
 
 /**
  * folio_batch_next - Return the next folio to process.
diff --git a/include/linux/swap.h b/include/linux/swap.h
index db46b25a65ae..8244475c2efe 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -401,6 +401,7 @@ extern void lru_add_drain(void);
 extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_cpu_zone(struct zone *zone);
 extern void lru_add_drain_all(void);
+extern void lru_add_and_bh_lrus_drain(void);
 void folio_deactivate(struct folio *folio);
 void folio_mark_lazyfree(struct folio *folio);
 extern void swap_setup(void);
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index e246287de9fa..553889f4e9be 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -8,6 +8,8 @@
  *
  */
 
+#include
+
 enum hk_flags {
 	HK_FLAG_DOMAIN = BIT(HK_TYPE_DOMAIN),
 	HK_FLAG_MANAGED_IRQ = BIT(HK_TYPE_MANAGED_IRQ),
@@ -253,6 +255,7 @@ __setup("isolcpus=", housekeeping_isolcpus_setup);
 #if defined(CONFIG_NO_HZ_FULL)
 static void isolated_task_work(struct callback_head *head)
 {
+	lru_add_and_bh_lrus_drain();
 }
 
 int __isolated_task_work_queue(void)
diff --git a/mm/swap.c b/mm/swap.c
index 77b2d5997873..99a1b7b81e86 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -37,6 +37,7 @@
 #include
 #include
 #include
+#include
 
 #include "internal.h"
 
@@ -155,6 +156,29 @@ static void lru_add(struct lruvec *lruvec, struct folio *folio)
 	trace_mm_lru_insertion(folio);
 }
 
+/**
+ * folio_batch_add() - Add a folio to a batch.
+ * @fbatch: The folio batch.
+ * @folio: The folio to add.
+ *
+ * The folio is added to the end of the batch.
+ * The batch must have previously been initialised using folio_batch_init().
+ *
+ * Return: The number of slots still available.
+ */
+unsigned int folio_batch_add(struct folio_batch *fbatch,
+		struct folio *folio)
+{
+	unsigned int ret;
+
+	fbatch->folios[fbatch->nr++] = folio;
+	ret = folio_batch_space(fbatch);
+	isolated_task_work_queue();
+
+	return ret;
+}
+EXPORT_SYMBOL(folio_batch_add);
+
 static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 {
 	int i;
@@ -738,7 +762,7 @@ void lru_add_drain(void)
  * the same cpu. It shouldn't be a problem in !SMP case since
  * the core is only one and the locks will disable preemption.
  */
-static void lru_add_and_bh_lrus_drain(void)
+void lru_add_and_bh_lrus_drain(void)
 {
 	local_lock(&cpu_fbatches.lock);
 	lru_add_drain_cpu(smp_processor_id());
@@ -864,6 +888,10 @@ static inline void __lru_add_drain_all(bool force_all_cpus)
 	for_each_online_cpu(cpu) {
 		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
 
+		/* Isolated CPUs handle their cache upon return to userspace */
+		if (!housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
+			continue;
+
 		if (cpu_needs_drain(cpu)) {
 			INIT_WORK(work, lru_add_drain_per_cpu);
 			queue_work_on(cpu, mm_percpu_wq, work);
-- 
2.48.1
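
The division of labour this patch describes (isolated CPUs drain their
own batches on the way back to userspace, and the global drain skips
them) can be modelled in a few lines of userspace C. This is a toy model
under loud assumptions, not kernel code: the four-CPU layout, the arrays
and the functions batch_add(), resume_to_user(), drain_all() and
lru_sketch_demo() are all hypothetical, and the model follows the intent
stated in the commit message rather than any particular kernel function.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Assumed layout for the model: CPUs 0-1 housekeeping, CPUs 2-3 isolated. */
static const bool housekeeping[NR_CPUS] = { true, true, false, false };

static int batched[NR_CPUS];		/* folios sitting in the per-CPU batch */
static bool drain_pending[NR_CPUS];	/* "drain before returning to userspace" */
static int remote_works;		/* drain works queued by drain_all() */

/* Adding to a batch on an isolated CPU arms the deferred drain once. */
static void batch_add(int cpu)
{
	batched[cpu]++;
	if (!housekeeping[cpu])
		drain_pending[cpu] = true;	/* idempotent, like a queued check */
}

/* Runs on the CPU itself, right before it resumes userspace. */
static void resume_to_user(int cpu)
{
	if (drain_pending[cpu]) {
		drain_pending[cpu] = false;
		batched[cpu] = 0;
	}
}

/* Global drain: only housekeeping CPUs get a remote work queued. */
static void drain_all(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!housekeeping[cpu])
			continue;	/* isolated CPUs drain themselves */
		if (batched[cpu]) {
			batched[cpu] = 0;
			remote_works++;
		}
	}
}

/* Returns how many remote drain works the global drain had to queue. */
int lru_sketch_demo(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		batched[cpu] = 0;
		drain_pending[cpu] = false;
	}
	remote_works = 0;

	batch_add(0);		/* housekeeping CPU: left for drain_all() */
	batch_add(2);		/* isolated CPU: arms the deferred drain */
	batch_add(2);
	resume_to_user(2);	/* batch emptied before userspace runs */
	assert(batched[2] == 0);

	drain_all();
	assert(batched[0] == 0);
	return remote_works;
}
```

In this model only the housekeeping CPU ever receives a remote drain
work, which is the property the series is after: nothing is queued on an
isolated CPU while its task runs in userspace.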