From nobody Sun Feb 8 23:15:41 2026
From: Prakash Sangappa
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com,
 tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com,
 vineethr@linux.ibm.com
Subject: [PATCH V5 1/6] Sched: Scheduler time slice extension
Date: Tue, 3 Jun 2025 23:36:49 +0000
Message-ID: <20250603233654.1838967-2-prakash.sangappa@oracle.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20250603233654.1838967-1-prakash.sangappa@oracle.com>
References: <20250603233654.1838967-1-prakash.sangappa@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add support for a thread to request an extension of its execution time
slice on the cpu. The extra cpu time granted helps the thread finish
executing its critical section and drop any locks without getting
preempted. The thread requests this cpu time extension by setting a bit
in the restartable sequences (rseq) structure registered with the kernel.
The kernel grants a 30us extension on the cpu when it sees the bit set.
With the help of a timer, the kernel force-preempts the thread if it is
still running on the cpu when the 30us timer expires. The thread should
yield the cpu by making a system call after completing the critical
section.

Suggested-by: Peter Zijlstra
Signed-off-by: Prakash Sangappa
---
v4:
- Changed default sched delay extension time to 30us
v3:
- Rename rseq_sched_delay -> sched_time_delay and move near other bits
  in struct task_struct.
- Use IS_ENABLED() check to access 'sched_time_delay' instead of #ifdef
- Modify comment describing RSEQ_CS_FLAG_DELAY_RESCHED flag.
- Remove rseq_delay_resched_tick() call from hrtick_clear().
v2:
- Add check in syscall_exit_to_user_mode_prepare() and reschedule if
  thread has 'rseq_sched_delay' set.
---
 include/linux/entry-common.h | 11 +++++--
 include/linux/sched.h        | 16 +++++++++++
 include/uapi/linux/rseq.h    |  7 +++++
 kernel/entry/common.c        | 19 ++++++++----
 kernel/rseq.c                | 56 ++++++++++++++++++++++++++++++++++++
 kernel/sched/core.c          | 14 +++++++++
 kernel/sched/syscalls.c      |  5 ++++
 7 files changed, 120 insertions(+), 8 deletions(-)

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index fc61d0205c97..cec343f95210 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -303,7 +303,8 @@ void arch_do_signal_or_restart(struct pt_regs *regs);
  * exit_to_user_mode_loop - do any pending work before leaving to user space
  */
 unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
-				     unsigned long ti_work);
+				     unsigned long ti_work,
+				     bool irq);
 
 /**
  * exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required
@@ -315,7 +316,8 @@ unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
  *    EXIT_TO_USER_MODE_WORK are set
  * 4) check that interrupts are still disabled
  */
-static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
+static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs,
+						bool irq)
 {
 	unsigned long ti_work;
 
@@ -326,7 +328,10 @@ static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
 
 	ti_work = read_thread_flags();
 	if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
-		ti_work = exit_to_user_mode_loop(regs, ti_work);
+		ti_work = exit_to_user_mode_loop(regs, ti_work, irq);
+
+	if (irq)
+		rseq_delay_resched_fini();
 
 	arch_exit_to_user_mode_prepare(regs, ti_work);
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index c08fd199be4e..14bf0508bfca 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -339,6 +339,7 @@ extern int __must_check io_schedule_prepare(void);
 extern void io_schedule_finish(int token);
 extern long io_schedule_timeout(long timeout);
 extern void io_schedule(void);
+extern void hrtick_local_start(u64 delay);
 
 /* wrapper function to trace from this header file */
 DECLARE_TRACEPOINT(sched_set_state_tp);
@@ -1044,6 +1045,7 @@ struct task_struct {
 	/* delay due to memory thrashing */
 	unsigned			in_thrashing:1;
 #endif
+	unsigned			sched_time_delay:1;
 #ifdef CONFIG_PREEMPT_RT
 	struct netdev_xmit		net_xmit;
 #endif
@@ -2249,6 +2251,20 @@ static inline bool owner_on_cpu(struct task_struct *owner)
 unsigned long sched_cpu_util(int cpu);
 #endif /* CONFIG_SMP */
 
+#ifdef CONFIG_RSEQ
+
+extern bool rseq_delay_resched(void);
+extern void rseq_delay_resched_fini(void);
+extern void rseq_delay_resched_tick(void);
+
+#else
+
+static inline bool rseq_delay_resched(void) { return false; }
+static inline void rseq_delay_resched_fini(void) { }
+static inline void rseq_delay_resched_tick(void) { }
+
+#endif
+
 #ifdef CONFIG_SCHED_CORE
 extern void sched_core_free(struct task_struct *tsk);
 extern void sched_core_fork(struct task_struct *p);
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index c233aae5eac9..25fc636b17d5 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -26,6 +26,7 @@ enum rseq_cs_flags_bit {
 	RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT	= 0,
 	RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT	= 1,
 	RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT	= 2,
+	RSEQ_CS_FLAG_DELAY_RESCHED_BIT		= 3,
 };
 
 enum rseq_cs_flags {
@@ -35,6 +36,8 @@ enum rseq_cs_flags {
 		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
 	RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE	=
 		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
+	RSEQ_CS_FLAG_DELAY_RESCHED		=
+		(1U << RSEQ_CS_FLAG_DELAY_RESCHED_BIT),
 };
 
 /*
@@ -128,6 +131,10 @@ struct rseq {
 	 * - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
 	 *     Inhibit instruction sequence block restart on migration for
 	 *     this thread.
+	 * - RSEQ_CS_FLAG_DELAY_RESCHED
+	 *     Request by user thread to delay preemption. With use
+	 *     of a timer, kernel grants extra cpu time upto 30us for this
+	 *     thread before being rescheduled.
 	 */
 	__u32 flags;
 
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 20154572ede9..b26adccb32df 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -88,7 +88,8 @@ void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
  * @ti_work:	TIF work flags as read by the caller
  */
 __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
-						     unsigned long ti_work)
+						     unsigned long ti_work,
+						     bool irq)
 {
 	/*
 	 * Before returning to user space ensure that all pending work
@@ -98,8 +99,12 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 
 		local_irq_enable_exit_to_user(ti_work);
 
-		if (ti_work & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY))
-			schedule();
+		if (ti_work & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
+			if (irq && rseq_delay_resched())
+				clear_tsk_need_resched(current);
+			else
+				schedule();
+		}
 
 		if (ti_work & _TIF_UPROBE)
 			uprobe_notify_resume(regs);
@@ -184,6 +189,10 @@ static void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
 
 	CT_WARN_ON(ct_state() != CT_STATE_KERNEL);
 
+	/* reschedule if sched delay was granted */
+	if (IS_ENABLED(CONFIG_RSEQ) && current->sched_time_delay)
+		set_tsk_need_resched(current);
+
 	if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
 		if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr))
 			local_irq_enable();
@@ -204,7 +213,7 @@ static __always_inline void __syscall_exit_to_user_mode_work(struct pt_regs *reg
 {
 	syscall_exit_to_user_mode_prepare(regs);
 	local_irq_disable_exit_to_user();
-	exit_to_user_mode_prepare(regs);
+	exit_to_user_mode_prepare(regs, false);
 }
 
 void syscall_exit_to_user_mode_work(struct pt_regs *regs)
@@ -228,7 +237,7 @@ noinstr void irqentry_enter_from_user_mode(struct pt_regs *regs)
 noinstr void irqentry_exit_to_user_mode(struct pt_regs *regs)
 {
 	instrumentation_begin();
-	exit_to_user_mode_prepare(regs);
+	exit_to_user_mode_prepare(regs, true);
 	instrumentation_end();
 	exit_to_user_mode();
 }
diff --git a/kernel/rseq.c b/kernel/rseq.c
index b7a1ec327e81..dba44ca9f624 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -448,6 +448,62 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
 	force_sigsegv(sig);
 }
 
+bool rseq_delay_resched(void)
+{
+	struct task_struct *t = current;
+	u32 flags;
+
+	if (!IS_ENABLED(CONFIG_SCHED_HRTICK))
+		return false;
+
+	if (!t->rseq)
+		return false;
+
+	if (t->sched_time_delay)
+		return false;
+
+	if (copy_from_user_nofault(&flags, &t->rseq->flags, sizeof(flags)))
+		return false;
+
+	if (!(flags & RSEQ_CS_FLAG_DELAY_RESCHED))
+		return false;
+
+	flags &= ~RSEQ_CS_FLAG_DELAY_RESCHED;
+	if (copy_to_user_nofault(&t->rseq->flags, &flags, sizeof(flags)))
+		return false;
+
+	t->sched_time_delay = 1;
+
+	return true;
+}
+
+void rseq_delay_resched_fini(void)
+{
+#ifdef CONFIG_SCHED_HRTICK
+	extern void hrtick_local_start(u64 delay);
+	struct task_struct *t = current;
+	/*
+	 * IRQs off, guaranteed to return to userspace, start timer on this CPU
+	 * to limit the resched-overdraft.
+	 *
+	 * If your critical section is longer than 30 us you get to keep the
+	 * pieces.
+	 */
+	if (t->sched_time_delay)
+		hrtick_local_start(30 * NSEC_PER_USEC);
+#endif
+}
+
+void rseq_delay_resched_tick(void)
+{
+#ifdef CONFIG_SCHED_HRTICK
+	struct task_struct *t = current;
+
+	if (t->sched_time_delay)
+		set_tsk_need_resched(t);
+#endif
+}
+
 #ifdef CONFIG_DEBUG_RSEQ
 
 /*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4de24eefe661..8c8960245ec0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -844,6 +844,8 @@ static enum hrtimer_restart hrtick(struct hrtimer *timer)
 
 	WARN_ON_ONCE(cpu_of(rq) != smp_processor_id());
 
+	rseq_delay_resched_tick();
+
 	rq_lock(rq, &rf);
 	update_rq_clock(rq);
 	rq->donor->sched_class->task_tick(rq, rq->curr, 1);
@@ -917,6 +919,16 @@ void hrtick_start(struct rq *rq, u64 delay)
 
 #endif /* CONFIG_SMP */
 
+void hrtick_local_start(u64 delay)
+{
+	struct rq *rq = this_rq();
+	struct rq_flags rf;
+
+	rq_lock(rq, &rf);
+	hrtick_start(rq, delay);
+	rq_unlock(rq, &rf);
+}
+
 static void hrtick_rq_init(struct rq *rq)
 {
 #ifdef CONFIG_SMP
@@ -6722,6 +6734,8 @@ static void __sched notrace __schedule(int sched_mode)
 picked:
 	clear_tsk_need_resched(prev);
 	clear_preempt_need_resched();
+	if (IS_ENABLED(CONFIG_RSEQ))
+		prev->sched_time_delay = 0;
 	rq->last_seen_need_resched_ns = 0;
 
 	is_switch = prev != next;
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index cd38f4e9899d..1b2b64fe0fb1 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -1378,6 +1378,11 @@ static void do_sched_yield(void)
  */
 SYSCALL_DEFINE0(sched_yield)
 {
+	if (IS_ENABLED(CONFIG_RSEQ) && current->sched_time_delay) {
+		schedule();
+		return 0;
+	}
+
 	do_sched_yield();
 	return 0;
 }
-- 
2.43.5

From nobody Sun Feb 8 23:15:41 2026
From: Prakash Sangappa
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com,
 tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com,
 vineethr@linux.ibm.com
Subject: [PATCH V5 2/6] Sched: Indicate if thread got rescheduled
Date: Tue, 3 Jun 2025 23:36:50 +0000
Message-ID: <20250603233654.1838967-3-prakash.sangappa@oracle.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20250603233654.1838967-1-prakash.sangappa@oracle.com>
References: <20250603233654.1838967-1-prakash.sangappa@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Use a bit in rseq flags to indicate if the thread got rescheduled
after the cpu time extension was granted. The user thread can check
this flag before calling sched_yield() to yield the cpu.

Signed-off-by: Prakash Sangappa
---
 include/linux/sched.h     |  2 ++
 include/uapi/linux/rseq.h | 10 ++++++++++
 kernel/rseq.c             | 20 ++++++++++++++++++++
 kernel/sched/core.c       |  3 +--
 4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 14bf0508bfca..71e6c8221c1e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2256,12 +2256,14 @@ unsigned long sched_cpu_util(int cpu);
 extern bool rseq_delay_resched(void);
 extern void rseq_delay_resched_fini(void);
 extern void rseq_delay_resched_tick(void);
+extern void rseq_delay_schedule(void);
 
 #else
 
 static inline bool rseq_delay_resched(void) { return false; }
 static inline void rseq_delay_resched_fini(void) { }
 static inline void rseq_delay_resched_tick(void) { }
+static inline void rseq_delay_schedule(void) { }
 
 #endif
 
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 25fc636b17d5..f4813d931387 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -27,6 +27,7 @@ enum rseq_cs_flags_bit {
 	RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT	= 1,
 	RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT	= 2,
 	RSEQ_CS_FLAG_DELAY_RESCHED_BIT		= 3,
+	RSEQ_CS_FLAG_RESCHEDULED_BIT		= 4,
 };
 
 enum rseq_cs_flags {
@@ -38,6 +39,9 @@ enum rseq_cs_flags {
 		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
 	RSEQ_CS_FLAG_DELAY_RESCHED		=
 		(1U << RSEQ_CS_FLAG_DELAY_RESCHED_BIT),
+	RSEQ_CS_FLAG_RESCHEDULED		=
+		(1U << RSEQ_CS_FLAG_RESCHEDULED_BIT),
+
 };
 
 /*
@@ -135,6 +139,12 @@ struct rseq {
 	 *     Request by user thread to delay preemption. With use
 	 *     of a timer, kernel grants extra cpu time upto 30us for this
 	 *     thread before being rescheduled.
+	 * - RSEQ_CS_FLAG_RESCHEDULED
+	 *     Set by kernel if the thread was rescheduled in the extra time
+	 *     granted due to request RSEQ_CS_DELAY_RESCHED. This bit is
+	 *     checked by the thread before calling sched_yield() to yield
+	 *     cpu. User thread sets this bit to 0, when setting
+	 *     RSEQ_CS_DELAY_RESCHED to request preemption delay.
 	 */
 	__u32 flags;
 
diff --git a/kernel/rseq.c b/kernel/rseq.c
index dba44ca9f624..9355654e9b38 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -504,6 +504,26 @@ void rseq_delay_resched_tick(void)
 #endif
 }
 
+void rseq_delay_schedule(void)
+{
+#ifdef CONFIG_SCHED_HRTICK
+	struct task_struct *t = current;
+	u32 flags;
+
+	if (t->sched_time_delay) {
+		t->sched_time_delay = 0;
+		if (!t->rseq)
+			return;
+		if (copy_from_user_nofault(&flags, &t->rseq->flags,
+					   sizeof(flags)))
+			return;
+		flags |= RSEQ_CS_FLAG_RESCHEDULED;
+		copy_to_user_nofault(&t->rseq->flags, &flags,
+				     sizeof(flags));
+	}
+#endif
+}
+
 #ifdef CONFIG_DEBUG_RSEQ
 
 /*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8c8960245ec0..86583fb72914 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6734,8 +6734,7 @@ static void __sched notrace __schedule(int sched_mode)
 picked:
 	clear_tsk_need_resched(prev);
 	clear_preempt_need_resched();
-	if (IS_ENABLED(CONFIG_RSEQ))
-		prev->sched_time_delay = 0;
+	rseq_delay_schedule();
 	rq->last_seen_need_resched_ns = 0;
 
 	is_switch = prev != next;
-- 
2.43.5

From nobody Sun Feb 8 23:15:41 2026
From: Prakash Sangappa
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com,
 tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com,
 vineethr@linux.ibm.com
Subject: [PATCH V5 3/6] Sched: Tunable to specify duration of time slice extension
Date: Tue, 3 Jun 2025 23:36:51 +0000
Message-ID: <20250603233654.1838967-4-prakash.sangappa@oracle.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20250603233654.1838967-1-prakash.sangappa@oracle.com>
References: <20250603233654.1838967-1-prakash.sangappa@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add a tunable to specify the duration of the scheduler time slice
extension. The default is 30us and the maximum value that can be
specified is 100us. Setting it to 0 disables scheduler time slice
extension.

Signed-off-by: Prakash Sangappa
---
v5:
- Added #ifdef CONFIG_RSEQ & CONFIG_PROC_SYSCTL
---
 include/linux/sched.h     |  5 +++++
 include/uapi/linux/rseq.h |  5 +++--
 kernel/rseq.c             |  7 +++++--
 kernel/sched/core.c       | 40 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 71e6c8221c1e..14069ebe26e2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -407,6 +407,11 @@ static inline void sched_domains_mutex_lock(void) { }
 static inline void sched_domains_mutex_unlock(void) { }
 #endif
 
+#ifdef CONFIG_RSEQ
+/* Scheduler time slice extension */
+extern unsigned int sysctl_sched_preempt_delay_us;
+#endif
+
 struct sched_param {
 	int sched_priority;
 };
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index f4813d931387..015534f064af 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -137,8 +137,9 @@ struct rseq {
 	 *     this thread.
 	 * - RSEQ_CS_FLAG_DELAY_RESCHED
 	 *     Request by user thread to delay preemption. With use
-	 *     of a timer, kernel grants extra cpu time upto 30us for this
-	 *     thread before being rescheduled.
+	 *     of a timer, kernel grants extra cpu time upto the tunable
+	 *     'sched_preempt_delay_us' value for this thread before it gets
+	 *     rescheduled.
 	 * - RSEQ_CS_FLAG_RESCHEDULED
 	 *     Set by kernel if the thread was rescheduled in the extra time
 	 *     granted due to request RSEQ_CS_DELAY_RESCHED. This bit is
diff --git a/kernel/rseq.c b/kernel/rseq.c
index 9355654e9b38..44d0f3ae0cd3 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -456,6 +456,8 @@ bool rseq_delay_resched(void)
 	if (!IS_ENABLED(CONFIG_SCHED_HRTICK))
 		return false;
 
+	if (!sysctl_sched_preempt_delay_us)
+		return false;
 	if (!t->rseq)
 		return false;
 
@@ -489,8 +491,9 @@ void rseq_delay_resched_fini(void)
 	 * If your critical section is longer than 30 us you get to keep the
 	 * pieces.
 	 */
-	if (t->sched_time_delay)
-		hrtick_local_start(30 * NSEC_PER_USEC);
+	if (sysctl_sched_preempt_delay_us && t->sched_time_delay)
+		hrtick_local_start(sysctl_sched_preempt_delay_us *
+				   NSEC_PER_USEC);
 #endif
 }
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 86583fb72914..e5307389b30a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -148,6 +148,17 @@ __read_mostly int sysctl_resched_latency_warn_once = 1;
  */
 __read_mostly unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;
 
+#ifdef CONFIG_RSEQ
+/*
+ * Scheduler time slice extension, duration in microsecs.
+ * Max value allowed 100us, default is 30us.
+ * If set to 0, scheduler time slice extension is disabled.
+ */
+#define SCHED_PREEMPT_DELAY_DEFAULT_US	30
+__read_mostly unsigned int sysctl_sched_preempt_delay_us =
+	SCHED_PREEMPT_DELAY_DEFAULT_US;
+#endif
+
 __read_mostly int scheduler_running;
 
 #ifdef CONFIG_SCHED_CORE
@@ -4664,6 +4675,24 @@ static int sysctl_schedstats(const struct ctl_table *table, int write, void *buf
 #endif /* CONFIG_PROC_SYSCTL */
 #endif /* CONFIG_SCHEDSTATS */
 
+#ifdef CONFIG_PROC_SYSCTL
+#ifdef CONFIG_RSEQ
+static int sysctl_sched_preempt_delay(const struct ctl_table *table, int write,
+		void *buffer, size_t *lenp, loff_t *ppos)
+{
+	int err;
+
+	err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (err < 0)
+		return err;
+	if (sysctl_sched_preempt_delay_us > SCHED_PREEMPT_DELAY_DEFAULT_US)
+		pr_warn("Sched preemption delay time set higher then default value %d us\n",
+			SCHED_PREEMPT_DELAY_DEFAULT_US);
+	return err;
+}
+#endif /* CONFIG_RSEQ */
+#endif /* CONFIG_PROC_SYSCTL */
+
 #ifdef CONFIG_SYSCTL
 static const struct ctl_table sched_core_sysctls[] = {
 #ifdef CONFIG_SCHEDSTATS
@@ -4711,6 +4740,17 @@ static const struct ctl_table sched_core_sysctls[] = {
 		.extra2		= SYSCTL_FOUR,
 	},
 #endif /* CONFIG_NUMA_BALANCING */
+#ifdef CONFIG_RSEQ
+	{
+		.procname	= "sched_preempt_delay_us",
+		.data		= &sysctl_sched_preempt_delay_us,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= sysctl_sched_preempt_delay,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE_HUNDRED,
+	},
+#endif /* CONFIG_RSEQ */
 };
 static int __init sched_core_sysctl_init(void)
 {
-- 
2.43.5

From nobody Sun Feb 8 23:15:41 2026
Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C91592522B5 for ; Tue, 3 Jun 2025 23:37:18 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32
ARC-Seal: i=1;
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748993840; cv=none; b=ZK1ctM2G5pz4p4hqSm2WTk5I2155T08gjV2NzAZDhLLj8PU0wq7AnO4j5xX/RWx8IykZXMFXfYec7jnV8S1mXHNa1m+cwAKq1I+7doA6fo7m+93t8cXjogaFw56uW02Y9e4tG9Y5lAlRPC/Wg+ricBftb6I3pOa51bO8pLOvHPU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748993840; c=relaxed/simple; bh=2uL+0Zwp87pysajm/okX2HDO9RisreP4PWYFKZElQS0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WceCEzFkh9hiGMF57q/McmMiimFm63GhBeE/nDMJBpgJfrJeIFzn7Ntn7Eu9cMqjBdhPG4xWOV1gBqrlYfEbeQHzSXuDsGJP1qcEbNDB0SR2+IZVVyFYcDFy6txSUwQiFG4c2/EzvLto8su6QvhihKG9STSH+ULJxHhVpmmCZU4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=piuOmFzT; arc=none smtp.client-ip=205.220.177.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="piuOmFzT" Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 553MNSt2000913; Tue, 3 Jun 2025 23:36:58 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=6hBEg VpVgl7/BUSASlqOj6zL7f7ZoKMd7QZmJcNgWeU=; b=piuOmFzTfIxiYRpXOXtZD 6vWN93OjnncJMrBhLCAlNVTUhACh0g4S0hY3h+Dds+bZazptO+F4jMVVZTRG8lQ9 4vjmwuujbDzMBNp0i0IDNXkE1e424RCnoMsEFFbFWFy8LnliK7/rqGTp3WqGyxTT XeKr9fGhk7RW4Ia8VbQ+k9rrza1f3A0WYtYlGPoFVdVf+BUgEz5/j7SRj+I/HKno vHdR6E8Cf86oYcWqh50p7VJY2qvGo6FzgNZT7boI7hghoav1TuuT6RH56+XHa6Lt 
/7ZZu7JZpo4Rq/gXSEkVkUSXKODxZeUUi1FU9HDse+4250EluJAkHnvW2DIX4D0O Q== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 471g8dty3b-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 03 Jun 2025 23:36:58 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 553MtHqw030803; Tue, 3 Jun 2025 23:36:57 GMT Received: from psang-work.osdevelopmeniad.oraclevcn.com (psang-work.allregionaliads.osdevelopmeniad.oraclevcn.com [100.100.253.35]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 46yr7a2j93-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 03 Jun 2025 23:36:57 +0000 From: Prakash Sangappa To: linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com, vineethr@linux.ibm.com Subject: [PATCH V5 4/6] Sched: Add scheduler stat for cpu time slice extension Date: Tue, 3 Jun 2025 23:36:52 +0000 Message-ID: <20250603233654.1838967-5-prakash.sangappa@oracle.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250603233654.1838967-1-prakash.sangappa@oracle.com> References: <20250603233654.1838967-1-prakash.sangappa@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.0.736,FMLib:17.12.80.40 definitions=2025-06-03_03,2025-06-03_02,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 malwarescore=0 suspectscore=0 phishscore=0 adultscore=0 spamscore=0 mlxlogscore=999 mlxscore=0 
classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2505160000 definitions=main-2506030202 X-Authority-Analysis: v=2.4 cv=Va/3PEp9 c=1 sm=1 tr=0 ts=683f871a b=1 cx=c_pps a=e1sVV491RgrpLwSTMOnk8w==:117 a=e1sVV491RgrpLwSTMOnk8w==:17 a=6IFa9wvqVegA:10 a=yPCof4ZbAAAA:8 a=vUPmPsuNeZtoDWJSgW8A:9 cc=ntf awl=host:14714 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwNjAzMDIwMiBTYWx0ZWRfX0DgZu18QMrk6 82rEYOm5HYxXlegybGoJ31/FPOVC+POt3tF7yXAAuEXOtvRj1cG1zVzB8GKf03MPp+xhnGim1Kq sign1lqGp2QrXxZ1t1HYAQQ6U8lGWRJ9+U480/tdSaAtZyTYHUVMIEzFOsB3HXdQei47n1DD6U3 LrsWDAl4nEpap3BqpkySPJR1XTt2JlCrjrFcu9UUc1YRV4qoUZtJ3rY/XMhjz/p6FehbE+P7jBo btj+CMgz5blI0NMmpIhjzFPFe2So2anz/8u8+Mhk7S70fkDkHt9K4PHuCyh/JzRDLWMipuPLRpR 3FGbtC6cc4RJeFlkk3NDydO8uWBq9Cj4BAG9Zx7riw81XlQ+LHvkrLbKvCNVvfSLRbM80vmWcji 5ro+OQQ/9Y9NBHQ4F78yGZxGSGwdq5Mu6SYfF7QViv35+uk1kO+t1lRq4ANvowb+y+Swwpyb X-Proofpoint-ORIG-GUID: 21ladm4DMoryhfZtKrzY7nhyoq5f47-9 X-Proofpoint-GUID: 21ladm4DMoryhfZtKrzY7nhyoq5f47-9 Content-Type: text/plain; charset="utf-8" Add scheduler stat to record number of times the thread was granted cpu time slice extension. 
Signed-off-by: Prakash Sangappa
---
v5:
- Added #ifdef CONFIG_RSEQ
---
 include/linux/sched.h | 7 +++++++
 kernel/rseq.c         | 1 +
 kernel/sched/core.c   | 7 +++++++
 kernel/sched/debug.c  | 4 ++++
 4 files changed, 19 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 14069ebe26e2..6c2e9d30c2fc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -340,6 +340,9 @@ extern void io_schedule_finish(int token);
 extern long io_schedule_timeout(long timeout);
 extern void io_schedule(void);
 extern void hrtick_local_start(u64 delay);
+#ifdef CONFIG_RSEQ
+extern void update_stat_preempt_delayed(struct task_struct *t);
+#endif
 
 /* wrapper function to trace from this header file */
 DECLARE_TRACEPOINT(sched_set_state_tp);
@@ -566,6 +569,10 @@ struct sched_statistics {
 	u64				nr_wakeups_passive;
 	u64				nr_wakeups_idle;
 
+#ifdef CONFIG_RSEQ
+	u64				nr_preempt_delay_granted;
+#endif
+
 #ifdef CONFIG_SCHED_CORE
 	u64				core_forceidle_sum;
 #endif
diff --git a/kernel/rseq.c b/kernel/rseq.c
index 44d0f3ae0cd3..c4bc52f8ba9c 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -475,6 +475,7 @@ bool rseq_delay_resched(void)
 		return false;
 
 	t->sched_time_delay = 1;
+	update_stat_preempt_delayed(t);
 
 	return true;
 }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e5307389b30a..95fce557a294 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -940,6 +940,13 @@ void hrtick_local_start(u64 delay)
 	rq_unlock(rq, &rf);
 }
 
+#ifdef CONFIG_RSEQ
+void update_stat_preempt_delayed(struct task_struct *t)
+{
+	schedstat_inc(t->stats.nr_preempt_delay_granted);
+}
+#endif
+
 static void hrtick_rq_init(struct rq *rq)
 {
 #ifdef CONFIG_SMP
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 4cba21f5d24d..b178cb0e2904 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -1217,6 +1217,10 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 		P_SCHEDSTAT(nr_wakeups_passive);
 		P_SCHEDSTAT(nr_wakeups_idle);
 
+#ifdef CONFIG_RSEQ
+		P_SCHEDSTAT(nr_preempt_delay_granted);
+#endif
+
 		avg_atom = p->se.sum_exec_runtime;
 		if (nr_switches)
 			avg_atom = div64_ul(avg_atom, nr_switches);
-- 
2.43.5

From nobody Sun Feb 8 23:15:41 2026
Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1767E2522BA for ; Tue, 3 Jun 2025 23:37:18 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748993841; cv=none; b=SA96LhJ/xVguT63StU2sNjzNq6Hworq11XU2BafrKFTn+vG9Y39F5ma/lmhX4JTS6JwU4XwVdMdkPJfNlb9c0uMiHXKqKT2rPSsFP1/mvbKEJgjlPSSfQBrAY6LBUOQb36zCAjYC3Kg5bKSUPPQWSTr7EKTHIXGo+60GJpL6NuE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748993841; c=relaxed/simple; bh=RwqSclDMY6GfCaZEU5c53IrFIXQSlzkanBN9AoDM/s4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=A2Hux1gH9HQ/Nv1CkHsDTBeFGyt7mr17bkv8KqGMCtuc0a033YZdqYdc7wu2dU7O7XI7qef9oq9jVBt5T5g9F15NdlHx02+3FVeNBl+K9ModyA30+/obpkOBma1H8SVlvyZHG6ORLWDX8KrgGFSC7+/D5bHBELqh4aMDikCU0Oc=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=DjvwatAU; arc=none smtp.client-ip=205.220.177.32
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="DjvwatAU"
Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP
id 553MN5EB003886; Tue, 3 Jun 2025 23:36:58 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=fP7NQ M9rSpI6g19shnrw/5oiAfJEIKQ6mCauSx8K+74=; b=DjvwatAUc1YiiWNIuxWmd bo2luZaIT197HsYg8+MSk8hwhKrz55fFaaWK8kVXSsMCOtqSbvYOmI9sSg8Q4IrJ vVM/k837MwoDwtjjm2Or4oE/ZPPbm7UlSHnckOS88LuIwkdpdY5ieXmHU6Saxbtc BqxCqJXG7FnqxXnCXUzlK+Hq2hONIRyl48z2n628fa/WRqiiJvyPvQalgQYzgYWl 4N2OZR78n0sOS6Enagjx/NWddopKVAKinU98mLfivJAu2UNXYI+6YOy8Voy03qcN w8e5kvvvRffHVoZPHXAKW/YYQAPzVfwN+q3aomHn2ibfv8UjpvO6dkq1Yxe3SdAr Q== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 471g8gawfe-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 03 Jun 2025 23:36:58 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 553MtHqx030803; Tue, 3 Jun 2025 23:36:57 GMT Received: from psang-work.osdevelopmeniad.oraclevcn.com (psang-work.allregionaliads.osdevelopmeniad.oraclevcn.com [100.100.253.35]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 46yr7a2j93-6 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 03 Jun 2025 23:36:57 +0000 From: Prakash Sangappa To: linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com, vineethr@linux.ibm.com Subject: [PATCH V5 5/6] Sched: Add tracepoint for sched time slice extension Date: Tue, 3 Jun 2025 23:36:53 +0000 Message-ID: <20250603233654.1838967-6-prakash.sangappa@oracle.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250603233654.1838967-1-prakash.sangappa@oracle.com> 
References: <20250603233654.1838967-1-prakash.sangappa@oracle.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.0.736,FMLib:17.12.80.40 definitions=2025-06-03_03,2025-06-03_02,2025-03-28_01
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 malwarescore=0 suspectscore=0 phishscore=0 adultscore=0 spamscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2505160000 definitions=main-2506030202
X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwNjAzMDIwMiBTYWx0ZWRfX9QkN12Wfj0l0 KdNI0eJuMyS40hXodUPD6SJO8Zj4xkeAnvTasXXxMdLB711gXFvOIdcqlv6poqjy448uSmS3emu g/zLQ0jg0ZfjmDNy1tbjfx/ntHJXuKoiFBG1SRIkYpqlwZ3LKb4VT1yxNWgVS/z7bfVd+9adMFc 79HzP22TnEfT+SpnZQSXelPDU4PP+IwNn088kqUzocPwZxGXjk7R8Ii6jLsU/6JpG3fX0EYF37V mdiTDQ08ULP1+0CZtnXyLLr1b5EM21Upl+g3dUO8RauGYaEq6zfeAWlaZw2FyfS9ck3Fv8JTgJa XqZuUt6BuOQfe5QTTY0EKiitejd0xJrTAJeZ0RXc64iwS+hEnlmJrpkiYFtzfm3DBmnLiR+uAr/ lKHYDbnF0/tq6+JpJDCJGLGo9qmL0sOoq3qYEMlu/1McanFNbD5Lj97pQ5USbwivvETnZf6Z
X-Proofpoint-GUID: PyLxm4N2FgS7u85TSFKPdtljAHdhppx0
X-Proofpoint-ORIG-GUID: PyLxm4N2FgS7u85TSFKPdtljAHdhppx0
X-Authority-Analysis: v=2.4 cv=H5Tbw/Yi c=1 sm=1 tr=0 ts=683f871a b=1 cx=c_pps a=e1sVV491RgrpLwSTMOnk8w==:117 a=e1sVV491RgrpLwSTMOnk8w==:17 a=6IFa9wvqVegA:10 a=yPCof4ZbAAAA:8 a=-a2y3q_jHnAO8POPY3MA:9 cc=ntf awl=host:14714
Content-Type: text/plain; charset="utf-8"

Add a tracepoint that fires when a thread's preemption is delayed, which
can occur if the running thread requested extra time on the cpu. The
tracepoint also reports which NEED_RESCHED flag, set on the thread, was
cleared.
Suggested-by: Sebastian Andrzej Siewior
Signed-off-by: Prakash Sangappa
---
 include/trace/events/sched.h | 28 ++++++++++++++++++++++++++++
 kernel/entry/common.c        | 12 ++++++++++--
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 8994e97d86c1..4aa04044b14a 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -296,6 +296,34 @@ TRACE_EVENT(sched_migrate_task,
 		  __entry->orig_cpu, __entry->dest_cpu)
 );
 
+/*
+ * Tracepoint for delayed resched requested by task:
+ */
+TRACE_EVENT(sched_delay_resched,
+
+	TP_PROTO(struct task_struct *p, unsigned int resched_flg),
+
+	TP_ARGS(p, resched_flg),
+
+	TP_STRUCT__entry(
+		__array(	char,	comm,	TASK_COMM_LEN	)
+		__field(	pid_t,	pid			)
+		__field(	int,	cpu			)
+		__field(	int,	flg			)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
+		__entry->pid		= p->pid;
+		__entry->cpu		= task_cpu(p);
+		__entry->flg		= resched_flg;
+	),
+
+	TP_printk("comm=%s pid=%d cpu=%d resched_flg_cleared=0x%x",
+		__entry->comm, __entry->pid, __entry->cpu, __entry->flg)
+
+);
+
 DECLARE_EVENT_CLASS(sched_process_template,
 
 	TP_PROTO(struct task_struct *p),
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index b26adccb32df..cd0f076920fd 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -12,6 +12,7 @@
 
 #include "common.h"
 
+#include
 #define CREATE_TRACE_POINTS
 #include
 
@@ -91,6 +92,7 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 						     unsigned long ti_work,
 						     bool irq)
 {
+	unsigned long ti_work_cleared = 0;
 	/*
 	 * Before returning to user space ensure that all pending work
 	 * items have been completed.
@@ -100,10 +102,12 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 		local_irq_enable_exit_to_user(ti_work);
 
 		if (ti_work & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
-			if (irq && rseq_delay_resched())
+			if (irq && rseq_delay_resched()) {
 				clear_tsk_need_resched(current);
-			else
+				ti_work_cleared = ti_work;
+			} else {
 				schedule();
+			}
 		}
 
 		if (ti_work & _TIF_UPROBE)
@@ -134,6 +138,10 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 		ti_work = read_thread_flags();
 	}
 
+	if (ti_work_cleared)
+		trace_sched_delay_resched(current, ti_work_cleared &
+			(_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY));
+
 	/* Return the latest work state for arch_exit_to_user_mode() */
 	return ti_work;
 }
-- 
2.43.5

From nobody Sun Feb 8 23:15:41 2026
Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4F1FD24DCFE for ; Tue, 3 Jun 2025 23:37:17 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748993839; cv=none; b=Nr5KlGUx3e+IViR0B2zc2L/Wc5lbIKr6fBowZRIo483pGRSIzB5nSJ6fU+/30JiRxxQj9EWXRiDgtckVput5fAiHDQRk3DbIOFwOD3XbzGKKAgMv5ZQsIoQpDBr7ZgHnF5xhYFgyhxX8RMYhvaiE/61F+WI9fKHg2R0Kcv3FyBg=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748993839; c=relaxed/simple; bh=CpW5+EnRH/eprOB2Im5mjNYvWXmMldYepx3fogdbW78=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=mpW8S7FeuGzxOlpgG7NT6XPQ2YIWqQlXdHJc5AOWrydTe+KfjHKIkCOMUJJMYll8oqFNhOdQa1/el3N/BboEYAOTMsdMkS0v9rdvSGX0jtUtSw4XNLDMuvTkN3i8T37Wd22K+tg2XjENKz775bE8OqgXzTi7OtpFrWwwlVOL3NU=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass
smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=TKfALx6Y; arc=none smtp.client-ip=205.220.177.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="TKfALx6Y" Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 553MN19S032342; Tue, 3 Jun 2025 23:36:59 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=2ziAp oYnxEqaxK0lHJjI7BRw8C3lEmMXkqJPO+VXKv8=; b=TKfALx6YhyegGOsg7CzRp M/SSTWtsTl0fRo/Z4ghCOQNus42M7cbb77ZiT+O0XaWkDVP+J2i2qjpaw1hYCoZh hdjVVu8dGuE0S650uftf7OayqDZ5ylClNoBSOhsDKdYPzYAfEOQsTjWY3JWOpl4w vwmW6vVY+kt6NGYObOzS2bVraoqOAGQ0OowVdVktB6+YmiyKkds323rrDslXLDSD OPs+4GqO6bDFka4ZgP55lF9U5G5ooAgWjHIHd0t+WoItxMr3rwhV4Lkkg+I5vFHQ o9gEy9/4P+Xn/rEqrqxSjfj69LFI/dYxK4+VMb5y9JdRV3oZDZn6/6dPL4rVKhPH w== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 471g8dty3c-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 03 Jun 2025 23:36:58 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 553MtHr0030803; Tue, 3 Jun 2025 23:36:58 GMT Received: from psang-work.osdevelopmeniad.oraclevcn.com (psang-work.allregionaliads.osdevelopmeniad.oraclevcn.com [100.100.253.35]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 46yr7a2j93-7 (version=TLSv1.2 
cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 03 Jun 2025 23:36:58 +0000 From: Prakash Sangappa To: linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com, vineethr@linux.ibm.com Subject: [PATCH V5 6/6] Add API to query supported rseq cs flags Date: Tue, 3 Jun 2025 23:36:54 +0000 Message-ID: <20250603233654.1838967-7-prakash.sangappa@oracle.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250603233654.1838967-1-prakash.sangappa@oracle.com> References: <20250603233654.1838967-1-prakash.sangappa@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.0.736,FMLib:17.12.80.40 definitions=2025-06-03_03,2025-06-03_02,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 malwarescore=0 suspectscore=0 phishscore=0 adultscore=0 spamscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2505160000 definitions=main-2506030202 X-Authority-Analysis: v=2.4 cv=Va/3PEp9 c=1 sm=1 tr=0 ts=683f871a b=1 cx=c_pps a=e1sVV491RgrpLwSTMOnk8w==:117 a=e1sVV491RgrpLwSTMOnk8w==:17 a=6IFa9wvqVegA:10 a=7d_E57ReAAAA:8 a=yPCof4ZbAAAA:8 a=f51I_UcxYSrpky289IcA:9 a=jhqOcbufqs7Y1TYCrUUU:22 cc=ntf awl=host:14714 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwNjAzMDIwMiBTYWx0ZWRfX6UgiaRWLlONQ MZ9uMgK0uDGxIEl6TpDg9YiaR4GZNuRgDk/KsVJHsSCC0deyKoAE5JoNhMXgsf6FDm5b0kZ6twt qx1ngG5gbnIh32SUkFVeNriJkTBFCg2KFLEuxBouX2yeKXM3suxYIQgKsU61SQeNNHzCMLIss05 Pkri9Wl/sxmZ3ST0DMR5ZaaJ34qov0DwKZgqAHI0ozGLJHKiGn6lIESK9ypUQO4nIJidBu26dFq Ra7yiXR7kwLlL5BPDGSnmKGmx3fO1Hin54xB/knK6JMy/I39+EMWOds22Qy6uVJ+TL9tLa1pn4p wxAaHctAiKupWpSnkKHhMJWOvm2qkVGSlzcbgehplwiYa4eZDcj7UIRox16YAC8Nl39NoZbEQwc 
vXSac5K2ZOLhRiNnZ8mH1xm8+kQa1OTbHeLIhOoyR8yPAn1gxZOgsqySRKskHW5/yq7/rvFN
X-Proofpoint-ORIG-GUID: rlXCmcjw8u6UR_QqL1MTkPuegnRj75D0
X-Proofpoint-GUID: rlXCmcjw8u6UR_QqL1MTkPuegnRj75D0
Content-Type: text/plain; charset="utf-8"

Add a new flag, RSEQ_FLAG_QUERY_CS_FLAGS, to the 'flags' argument of
sys_rseq. When this flag is passed, the syscall returns a bit mask of
all the supported rseq cs flags in the 'flags' member of the
user-provided rseq struct.

Suggested-by: Mathieu Desnoyers
Signed-off-by: Prakash Sangappa
---
v5:
- Removed deprecated flags from supported cs flags returned.
- Added IS_ENABLED(CONFIG_SCHED_HRTICK)
---
 include/uapi/linux/rseq.h |  1 +
 kernel/rseq.c             | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 015534f064af..44baea9dd10a 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -20,6 +20,7 @@ enum rseq_cpu_id_state {
 
 enum rseq_flags {
 	RSEQ_FLAG_UNREGISTER = (1 << 0),
+	RSEQ_FLAG_QUERY_CS_FLAGS = (1 << 1),
 };
 
 enum rseq_cs_flags_bit {
diff --git a/kernel/rseq.c b/kernel/rseq.c
index c4bc52f8ba9c..d2b010dccff5 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -576,6 +576,22 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		return 0;
 	}
 
+	/*
+	 * Return supported rseq_cs flags.
+	 */
+	if (flags & RSEQ_FLAG_QUERY_CS_FLAGS) {
+		u32 rseq_csflags = RSEQ_CS_FLAG_DELAY_RESCHED |
+				   RSEQ_CS_FLAG_RESCHEDULED;
+		/* Following is required for delay resched support */
+		if (!IS_ENABLED(CONFIG_SCHED_HRTICK))
+			return -EINVAL;
+		if (!rseq)
+			return -EINVAL;
+		if (copy_to_user(&rseq->flags, &rseq_csflags, sizeof(u32)))
+			return -EFAULT;
+		return 0;
+	}
+
 	if (unlikely(flags))
 		return -EINVAL;
 
-- 
2.43.5
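
For illustration only, a user-space probe for the query flag in 6/6 could
look roughly like the sketch below. It is not part of the series. The names
rseq_probe and rseq_query_cs_flags() are hypothetical, the flag value is
copied from the uapi hunk above, and the mirror struct assumes that 'flags'
sits at byte offset 16 of struct rseq as in the current uapi layout. On a
kernel without this series the call is expected to fail, most likely with
EINVAL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value taken from the patch; not available in released uapi headers. */
#define RSEQ_FLAG_QUERY_CS_FLAGS (1 << 1)

/*
 * Hypothetical stand-in for the leading fields of struct rseq, so the
 * sketch does not depend on <linux/rseq.h> being installed.
 */
struct rseq_probe {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t pad[3];
} __attribute__((aligned(32)));

/* Assumption check: 'flags' is where the kernel is expected to write. */
_Static_assert(offsetof(struct rseq_probe, flags) == 16,
	       "flags offset differs from the assumed struct rseq layout");

/*
 * Ask the kernel which rseq_cs flags it supports.
 * Returns 0 and stores the mask in *mask, or a negative errno value.
 */
static int rseq_query_cs_flags(uint32_t *mask)
{
	/*
	 * Static storage, so the address handed to the kernel stays valid
	 * for the life of the process whatever the kernel does with it.
	 */
	static struct rseq_probe rs;

	assert(mask != NULL);
#ifdef __NR_rseq
	memset(&rs, 0, sizeof(rs));
	if (syscall(__NR_rseq, &rs, (uint32_t)sizeof(rs),
		    RSEQ_FLAG_QUERY_CS_FLAGS, 0) != 0)
		return -errno;
	*mask = rs.flags;
	return 0;
#else
	return -ENOSYS;
#endif
}
```

A caller would test the returned mask against RSEQ_CS_FLAG_DELAY_RESCHED
before relying on the time slice extension, and treat any negative return
value as "not supported".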