From nobody Fri Dec 19 06:32:57 2025 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 85A91165EFC for ; Tue, 13 May 2025 21:46:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747172793; cv=none; b=WvHSTvs7Xs9tSw/RQvV5i2VJIoF0AEhqWpsAbxMJWEL6tF+QOC5d1zU591T4GpbvNWAtGcLkrtu13aNazRevPjltEdk6Y0donvKViHEVFkjw+vvMmK3kRW6T0Lg4Vuak+zRxaZvfiJRUAN/t6zRr2Rd7r+IyHbouYjjXW8xqKtg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747172793; c=relaxed/simple; bh=1gF8sfxKoUcUC3oRs8OvJh0NPRa/5LiEHfgtnQToEaE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=cnaRno7scu85io4XLuFNub2NVILaV3QHEBDowvpRfNwIZH450NAQnfMRcZBafO9ku5Pkq2Hecmn27jbNqcM9kfWshr/wMakzFKGV2lkihKtb05e4P1TT5stH3cVtxLZDxPvRIH/5QsZvUrEtnmdjJcLLUFzrQwliwW+E6TzQJNg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=DycCXf6E; arc=none smtp.client-ip=205.220.177.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="DycCXf6E" Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 54DL0x8Q009056; Tue, 13 May 2025 21:46:00 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc 
:content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=KjaHb 3KQbcgOb9Od8djiq9qaILMCuzZCA9rM+qMeMZc=; b=DycCXf6EC93XUt+fmvUMj 653nar36uvluRWxUEgzgDivkv0/OZ+GCPoKJtXWQGcs4CO80X5UQJta7lzQzc6HV SRN+HPnGI4XF8SI1zNQwck+NiMCoGQpGpLiPkp8IghHKztsZKswO5kQwHzaSCUp1 WwUAJvWOiCAjAxx5f0dfhyTugX7rfCs9me3MRH/Ek0SlMRbljZ3EN+6WiCxRagVE KKpQY3GU5wR7CZrSCJY8gvf9PVqg2e2juWAUVc0qgAc9GlGlJAMnKM1Cdxt7MB7W Tn69UTovTfKSHHGVpae+puebIK4ufOiM0GJfucJ8sRUG+r/3l2QNBDBYJ42KOW8H g== Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.appoci.oracle.com [138.1.37.129]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 46mbchgbbm-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 13 May 2025 21:45:59 +0000 (GMT) Received: from pps.filterd (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 54DLQjeM001890; Tue, 13 May 2025 21:45:58 GMT Received: from psang-work.osdevelopmeniad.oraclevcn.com (psang-work.allregionaliads.osdevelopmeniad.oraclevcn.com [100.100.253.35]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 46mc4yv943-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 13 May 2025 21:45:58 +0000 From: Prakash Sangappa To: linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com Subject: [PATCH V4 1/6] Sched: Scheduler time slice extension Date: Tue, 13 May 2025 21:45:49 +0000 Message-ID: <20250513214554.4160454-2-prakash.sangappa@oracle.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250513214554.4160454-1-prakash.sangappa@oracle.com> References: <20250513214554.4160454-1-prakash.sangappa@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: 
List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.0.736,FMLib:17.12.80.40 definitions=2025-05-13_03,2025-05-09_01,2025-02-21_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 mlxscore=0 bulkscore=0 suspectscore=0 malwarescore=0 spamscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2505070000 definitions=main-2505130206 X-Proofpoint-GUID: ODAaxJSy5h9ny4WUSSnQNb067fsfuMhA X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwNTEzMDIwNSBTYWx0ZWRfX1qpINTyQU7Id W4kZ0OsFE1pzBgzE5DYNv/KFKvAXRvLnkRTEb4mM2JLuxfozMPK6SDQjANFAxf2DpyDASspTYzh lNAGZ0eskKDkM162Z1ezWZO6h4qQwwawV+oisHYPK4iTDHvG4Wh1uXjEgJhHBGi1g9ub/ZBXVHl H4TsWIaCHFMLIZAMrhsRT9grryaysl6mF3mKgAmlPFIQQYN9XUJ2oHPtGcsx2vWHF3NsV2MaH2g 3lQaO1FHuIizCQcXGu+NBsJKgha0zjtATjrCYjL49glBFANaSY04UNsNQ+evefikFjkkF0kIYMN +HK1iJ9vYmMYUuHUkkD0/U955QlgNGQjJG3oF1i4IyTHMCNdIvfviB7hvDT9L0T5nTSuNoGYHfQ Cmof+y2rx4fIwcPPjdNMrvwQf+EPj6ftKehDfoTfbtUhGzYoeRYisPXH0pBWwJaHWeGZhPsI X-Proofpoint-ORIG-GUID: ODAaxJSy5h9ny4WUSSnQNb067fsfuMhA X-Authority-Analysis: v=2.4 cv=Da8XqutW c=1 sm=1 tr=0 ts=6823bd97 b=1 cx=c_pps a=WeWmnZmh0fydH62SvGsd2A==:117 a=WeWmnZmh0fydH62SvGsd2A==:17 a=dt9VzEwgFbYA:10 a=JfrnYn6hAAAA:8 a=yPCof4ZbAAAA:8 a=xLJU5_iY2eMPnPBfH_8A:9 a=1CNFftbPRP8L7MoqJWF3:22 Content-Type: text/plain; charset="utf-8" Add support for a thread to request extending its execution time slice on the cpu. The extra cpu time granted would help in allowing the thread to complete executing the critical section and drop any locks without getting preempted. The thread would request this cpu time extension, by setting a bit in the restartable sequences(rseq) structure registered with the kernel. Kernel will grant a 30us extension on the cpu, when it sees the bit set. 
With the help of a timer, the kernel forcibly preempts the thread if it is still running on the cpu when the 30us timer expires. The thread should yield the cpu by making a system call after completing the critical section. Suggested-by: Peter Zijlstra Signed-off-by: Prakash Sangappa --- include/linux/entry-common.h | 11 +++++-- include/linux/sched.h | 16 +++++++++++ include/uapi/linux/rseq.h | 7 +++++ kernel/entry/common.c | 19 ++++++++---- kernel/rseq.c | 56 ++++++++++++++++++++++++++++++++++++ kernel/sched/core.c | 14 +++++++++ kernel/sched/syscalls.c | 5 ++++ 7 files changed, 120 insertions(+), 8 deletions(-) diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h index fc61d0205c97..cec343f95210 100644 --- a/include/linux/entry-common.h +++ b/include/linux/entry-common.h @@ -303,7 +303,8 @@ void arch_do_signal_or_restart(struct pt_regs *regs); * exit_to_user_mode_loop - do any pending work before leaving to user spa= ce */ unsigned long exit_to_user_mode_loop(struct pt_regs *regs, - unsigned long ti_work); + unsigned long ti_work, + bool irq); =20 /** * exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required @@ -315,7 +316,8 @@ unsigned long exit_to_user_mode_loop(struct pt_regs *re= gs, * EXIT_TO_USER_MODE_WORK are set * 4) check that interrupts are still disabled */ -static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs) +static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs, + bool irq) { unsigned long ti_work; =20 @@ -326,7 +328,10 @@ static __always_inline void exit_to_user_mode_prepare(= struct pt_regs *regs) =20 ti_work =3D read_thread_flags(); if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK)) - ti_work =3D exit_to_user_mode_loop(regs, ti_work); + ti_work =3D exit_to_user_mode_loop(regs, ti_work, irq); + + if (irq) + rseq_delay_resched_fini(); =20 arch_exit_to_user_mode_prepare(regs, ti_work); =20 diff --git a/include/linux/sched.h b/include/linux/sched.h index c08fd199be4e..14bf0508bfca 
100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -339,6 +339,7 @@ extern int __must_check io_schedule_prepare(void); extern void io_schedule_finish(int token); extern long io_schedule_timeout(long timeout); extern void io_schedule(void); +extern void hrtick_local_start(u64 delay); =20 /* wrapper function to trace from this header file */ DECLARE_TRACEPOINT(sched_set_state_tp); @@ -1044,6 +1045,7 @@ struct task_struct { /* delay due to memory thrashing */ unsigned in_thrashing:1; #endif + unsigned sched_time_delay:1; #ifdef CONFIG_PREEMPT_RT struct netdev_xmit net_xmit; #endif @@ -2249,6 +2251,20 @@ static inline bool owner_on_cpu(struct task_struct *= owner) unsigned long sched_cpu_util(int cpu); #endif /* CONFIG_SMP */ =20 +#ifdef CONFIG_RSEQ + +extern bool rseq_delay_resched(void); +extern void rseq_delay_resched_fini(void); +extern void rseq_delay_resched_tick(void); + +#else + +static inline bool rseq_delay_resched(void) { return false; } +static inline void rseq_delay_resched_fini(void) { } +static inline void rseq_delay_resched_tick(void) { } + +#endif + #ifdef CONFIG_SCHED_CORE extern void sched_core_free(struct task_struct *tsk); extern void sched_core_fork(struct task_struct *p); diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h index c233aae5eac9..25fc636b17d5 100644 --- a/include/uapi/linux/rseq.h +++ b/include/uapi/linux/rseq.h @@ -26,6 +26,7 @@ enum rseq_cs_flags_bit { RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT =3D 0, RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT =3D 1, RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT =3D 2, + RSEQ_CS_FLAG_DELAY_RESCHED_BIT =3D 3, }; =20 enum rseq_cs_flags { @@ -35,6 +36,8 @@ enum rseq_cs_flags { (1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT), RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE =3D (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT), + RSEQ_CS_FLAG_DELAY_RESCHED =3D + (1U << RSEQ_CS_FLAG_DELAY_RESCHED_BIT), }; =20 /* @@ -128,6 +131,10 @@ struct rseq { * - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE * Inhibit instruction 
sequence block restart on migration for * this thread. + * - RSEQ_CS_FLAG_DELAY_RESCHED + * Request by user thread to delay preemption. With use + * of a timer, kernel grants extra cpu time upto 30us for this + * thread before being rescheduled. */ __u32 flags; =20 diff --git a/kernel/entry/common.c b/kernel/entry/common.c index 20154572ede9..b26adccb32df 100644 --- a/kernel/entry/common.c +++ b/kernel/entry/common.c @@ -88,7 +88,8 @@ void __weak arch_do_signal_or_restart(struct pt_regs *reg= s) { } * @ti_work: TIF work flags as read by the caller */ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs, - unsigned long ti_work) + unsigned long ti_work, + bool irq) { /* * Before returning to user space ensure that all pending work @@ -98,8 +99,12 @@ __always_inline unsigned long exit_to_user_mode_loop(str= uct pt_regs *regs, =20 local_irq_enable_exit_to_user(ti_work); =20 - if (ti_work & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) - schedule(); + if (ti_work & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) { + if (irq && rseq_delay_resched()) + clear_tsk_need_resched(current); + else + schedule(); + } =20 if (ti_work & _TIF_UPROBE) uprobe_notify_resume(regs); @@ -184,6 +189,10 @@ static void syscall_exit_to_user_mode_prepare(struct p= t_regs *regs) =20 CT_WARN_ON(ct_state() !=3D CT_STATE_KERNEL); =20 + /* reschedule if sched delay was granted */ + if (IS_ENABLED(CONFIG_RSEQ) && current->sched_time_delay) + set_tsk_need_resched(current); + if (IS_ENABLED(CONFIG_PROVE_LOCKING)) { if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr)) local_irq_enable(); @@ -204,7 +213,7 @@ static __always_inline void __syscall_exit_to_user_mode= _work(struct pt_regs *reg { syscall_exit_to_user_mode_prepare(regs); local_irq_disable_exit_to_user(); - exit_to_user_mode_prepare(regs); + exit_to_user_mode_prepare(regs, false); } =20 void syscall_exit_to_user_mode_work(struct pt_regs *regs) @@ -228,7 +237,7 @@ noinstr void irqentry_enter_from_user_mode(struct 
pt_re= gs *regs) noinstr void irqentry_exit_to_user_mode(struct pt_regs *regs) { instrumentation_begin(); - exit_to_user_mode_prepare(regs); + exit_to_user_mode_prepare(regs, true); instrumentation_end(); exit_to_user_mode(); } diff --git a/kernel/rseq.c b/kernel/rseq.c index b7a1ec327e81..dba44ca9f624 100644 --- a/kernel/rseq.c +++ b/kernel/rseq.c @@ -448,6 +448,62 @@ void __rseq_handle_notify_resume(struct ksignal *ksig,= struct pt_regs *regs) force_sigsegv(sig); } =20 +bool rseq_delay_resched(void) +{ + struct task_struct *t =3D current; + u32 flags; + + if (!IS_ENABLED(CONFIG_SCHED_HRTICK)) + return false; + + if (!t->rseq) + return false; + + if (t->sched_time_delay) + return false; + + if (copy_from_user_nofault(&flags, &t->rseq->flags, sizeof(flags))) + return false; + + if (!(flags & RSEQ_CS_FLAG_DELAY_RESCHED)) + return false; + + flags &=3D ~RSEQ_CS_FLAG_DELAY_RESCHED; + if (copy_to_user_nofault(&t->rseq->flags, &flags, sizeof(flags))) + return false; + + t->sched_time_delay =3D 1; + + return true; +} + +void rseq_delay_resched_fini(void) +{ +#ifdef CONFIG_SCHED_HRTICK + extern void hrtick_local_start(u64 delay); + struct task_struct *t =3D current; + /* + * IRQs off, guaranteed to return to userspace, start timer on this CPU + * to limit the resched-overdraft. + * + * If your critical section is longer than 30 us you get to keep the + * pieces. 
+ */ + if (t->sched_time_delay) + hrtick_local_start(30 * NSEC_PER_USEC); +#endif +} + +void rseq_delay_resched_tick(void) +{ +#ifdef CONFIG_SCHED_HRTICK + struct task_struct *t =3D current; + + if (t->sched_time_delay) + set_tsk_need_resched(t); +#endif +} + #ifdef CONFIG_DEBUG_RSEQ =20 /* diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 4de24eefe661..8c8960245ec0 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -844,6 +844,8 @@ static enum hrtimer_restart hrtick(struct hrtimer *time= r) =20 WARN_ON_ONCE(cpu_of(rq) !=3D smp_processor_id()); =20 + rseq_delay_resched_tick(); + rq_lock(rq, &rf); update_rq_clock(rq); rq->donor->sched_class->task_tick(rq, rq->curr, 1); @@ -917,6 +919,16 @@ void hrtick_start(struct rq *rq, u64 delay) =20 #endif /* CONFIG_SMP */ =20 +void hrtick_local_start(u64 delay) +{ + struct rq *rq =3D this_rq(); + struct rq_flags rf; + + rq_lock(rq, &rf); + hrtick_start(rq, delay); + rq_unlock(rq, &rf); +} + static void hrtick_rq_init(struct rq *rq) { #ifdef CONFIG_SMP @@ -6722,6 +6734,8 @@ static void __sched notrace __schedule(int sched_mode) picked: clear_tsk_need_resched(prev); clear_preempt_need_resched(); + if (IS_ENABLED(CONFIG_RSEQ)) + prev->sched_time_delay =3D 0; rq->last_seen_need_resched_ns =3D 0; =20 is_switch =3D prev !=3D next; diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c index cd38f4e9899d..1b2b64fe0fb1 100644 --- a/kernel/sched/syscalls.c +++ b/kernel/sched/syscalls.c @@ -1378,6 +1378,11 @@ static void do_sched_yield(void) */ SYSCALL_DEFINE0(sched_yield) { + if (IS_ENABLED(CONFIG_RSEQ) && current->sched_time_delay) { + schedule(); + return 0; + } + do_sched_yield(); return 0; } --=20 2.43.5 From nobody Fri Dec 19 06:32:57 2025 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE3E11EB1B5 for ; 
Tue, 13 May 2025 21:46:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.165.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747172795; cv=none; b=YKEcBplCV1mDjOMW3Vj2R1kXwV6dZ2naMRlPDJhgV2JMOFB2fhkg+L2Hpwa+2nUarlwXhT6z/W1BtGmQKQ1PxfIG+XDwA+mfhbOCmQ5AlyNgd3AFxuwxTy9AwxVZq6NLfVcISKx7j4BK2OQliwfB5AmcbaGc+IUc6sp31y6jEik= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747172795; c=relaxed/simple; bh=Hvp3nU+HjLeWs761iA7FXQjwIa2uP2Ctm8FYDl4ullI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nSzTzXW3nbeUhpye+RugxSf+8xhjrsZpiDba2L5Ht22FtLdnhZpGO1JsFlt1HrlOdTkSu/HXGOcqxXBpPsY4Y5k6q4nlxWQtcV94QAVUO4PdtV1Zg5H15gg1UmnF/nNKH6xWCmOoejkCJqIbJ9oj5sAOGzMb5mUAzw5nS2Vqsq8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=WVJGRlxK; arc=none smtp.client-ip=205.220.165.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="WVJGRlxK" Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 54DL0lIf021616; Tue, 13 May 2025 21:46:00 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=ynYPe KlKONZVtL6WidNA+27i1VPQsBZGopRVJY9HXyw=; b=WVJGRlxKbdSDskKRYipXs Bi/lZ+Bn8bYGfuQlBm4j5d+3RpO+pbTJgvwhcNME9Ujy4/lchOh0nDOxsD2UObLK /hN7wyBwzsCCJGODoyThH5kqQn7gELAOgSntLameMRX1E2g/67umyU60FzGJr3gR 
b2b5851yY0N71oh5YLnoHlvwnP7q1XrQjPofajZeX1YkQxu6gsDw1YEq28s7Hp5X P3i72yhYrxDvbdjEEsA+GSrFRKFR9I3D4JJFWKViRZuP/kRuEBkZc60bHqpxkA+w 2pZpeCrZBY1sQGvAS7sWozFuUteWm7mH2A/4k4vdxKdLkRQYjODEFTdNZX+OMfnt g== Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.appoci.oracle.com [138.1.37.129]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 46mbcmgat0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 13 May 2025 21:46:00 +0000 (GMT) Received: from pps.filterd (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 54DLQjeN001890; Tue, 13 May 2025 21:45:59 GMT Received: from psang-work.osdevelopmeniad.oraclevcn.com (psang-work.allregionaliads.osdevelopmeniad.oraclevcn.com [100.100.253.35]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 46mc4yv943-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 13 May 2025 21:45:59 +0000 From: Prakash Sangappa To: linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com Subject: [PATCH V4 2/6] Sched: Indicate if thread got rescheduled Date: Tue, 13 May 2025 21:45:50 +0000 Message-ID: <20250513214554.4160454-3-prakash.sangappa@oracle.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250513214554.4160454-1-prakash.sangappa@oracle.com> References: <20250513214554.4160454-1-prakash.sangappa@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.0.736,FMLib:17.12.80.40 definitions=2025-05-13_03,2025-05-09_01,2025-02-21_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 
mlxscore=0 bulkscore=0 suspectscore=0 malwarescore=0 spamscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2505070000 definitions=main-2505130206 X-Proofpoint-ORIG-GUID: HVbylk16ERiHEGxsuCkMG3AF2jRQuj0K X-Authority-Analysis: v=2.4 cv=f+RIBPyM c=1 sm=1 tr=0 ts=6823bd98 b=1 cx=c_pps a=WeWmnZmh0fydH62SvGsd2A==:117 a=WeWmnZmh0fydH62SvGsd2A==:17 a=dt9VzEwgFbYA:10 a=yPCof4ZbAAAA:8 a=O1hS_E6ypt93dt2FiYsA:9 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwNTEzMDIwNSBTYWx0ZWRfX+L7MOWJvFTgK Htfz7MD9r/885+fud1RFh4atJaju49pihhNj+HrUB44zryHCYuEo9UseF9/YZp6NpydMpHXkxxL HR8+fHNs1mnxb3+WRkezzP8hrTq2sh2966ACTUAitchHrGmwaL+/R6adn1hURhnK+QSNDVwkbjq S9rUubaENnAYLmPTjMLeDx0h+9pw2mED/oiMQIe5QVBC+t2Wc4q+K240I4D5tHkH7ADp02arguR SMdxDGCgJ6JbIU1Wdr/NX0v7DWM3mdBVMMmxIosQvUcKFx46Fj5bPbeqNuutZ+fdw3Vdc4qcqMh +YgIh2KkYgg83dUeqmjnyInE13u4U6bpK/CLlp6dh2TnoLXwsLmEFnurY0PzKgenur0U+aH7Ugt 8Xism+J1TVp81X04tBBcK4TAiwekiH4kiRlJzau6oDINvLSw0k6cyRLmCFoAd3Hd6PsrC9mr X-Proofpoint-GUID: HVbylk16ERiHEGxsuCkMG3AF2jRQuj0K Content-Type: text/plain; charset="utf-8" Use a bit in rseq flags to indicate if the thread got rescheduled after the cpu time extension was granted. The user thread can check this flag before calling sched_yield() to yield the cpu. 
Signed-off-by: Prakash Sangappa --- include/linux/sched.h | 2 ++ include/uapi/linux/rseq.h | 10 ++++++++++ kernel/rseq.c | 20 ++++++++++++++++++++ kernel/sched/core.c | 3 +-- 4 files changed, 33 insertions(+), 2 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 14bf0508bfca..71e6c8221c1e 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2256,12 +2256,14 @@ unsigned long sched_cpu_util(int cpu); extern bool rseq_delay_resched(void); extern void rseq_delay_resched_fini(void); extern void rseq_delay_resched_tick(void); +extern void rseq_delay_schedule(void); =20 #else =20 static inline bool rseq_delay_resched(void) { return false; } static inline void rseq_delay_resched_fini(void) { } static inline void rseq_delay_resched_tick(void) { } +static inline void rseq_delay_schedule(void) { } =20 #endif =20 diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h index 25fc636b17d5..f4813d931387 100644 --- a/include/uapi/linux/rseq.h +++ b/include/uapi/linux/rseq.h @@ -27,6 +27,7 @@ enum rseq_cs_flags_bit { RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT =3D 1, RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT =3D 2, RSEQ_CS_FLAG_DELAY_RESCHED_BIT =3D 3, + RSEQ_CS_FLAG_RESCHEDULED_BIT =3D 4, }; =20 enum rseq_cs_flags { @@ -38,6 +39,9 @@ enum rseq_cs_flags { (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT), RSEQ_CS_FLAG_DELAY_RESCHED =3D (1U << RSEQ_CS_FLAG_DELAY_RESCHED_BIT), + RSEQ_CS_FLAG_RESCHEDULED =3D + (1U << RSEQ_CS_FLAG_RESCHEDULED_BIT), + }; =20 /* @@ -135,6 +139,12 @@ struct rseq { * Request by user thread to delay preemption. With use * of a timer, kernel grants extra cpu time upto 30us for this * thread before being rescheduled. + * - RSEQ_CS_FLAG_RESCHEDULED + * Set by kernel if the thread was rescheduled in the extra time + * granted due to request RSEQ_CS_DELAY_RESCHED. This bit is + * checked by the thread before calling sched_yield() to yield + * cpu. 
User thread sets this bit to 0, when setting + * RSEQ_CS_DELAY_RESCHED to request preemption delay. */ __u32 flags; =20 diff --git a/kernel/rseq.c b/kernel/rseq.c index dba44ca9f624..9355654e9b38 100644 --- a/kernel/rseq.c +++ b/kernel/rseq.c @@ -504,6 +504,26 @@ void rseq_delay_resched_tick(void) #endif } =20 +void rseq_delay_schedule(void) +{ +#ifdef CONFIG_SCHED_HRTICK + struct task_struct *t =3D current; + u32 flags; + + if (t->sched_time_delay) { + t->sched_time_delay =3D 0; + if (!t->rseq) + return; + if (copy_from_user_nofault(&flags, &t->rseq->flags, + sizeof(flags))) + return; + flags |=3D RSEQ_CS_FLAG_RESCHEDULED; + copy_to_user_nofault(&t->rseq->flags, &flags, + sizeof(flags)); + } +#endif +} + #ifdef CONFIG_DEBUG_RSEQ =20 /* diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 8c8960245ec0..86583fb72914 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6734,8 +6734,7 @@ static void __sched notrace __schedule(int sched_mode) picked: clear_tsk_need_resched(prev); clear_preempt_need_resched(); - if (IS_ENABLED(CONFIG_RSEQ)) - prev->sched_time_delay =3D 0; + rseq_delay_schedule(); rq->last_seen_need_resched_ns =3D 0; =20 is_switch =3D prev !=3D next; --=20 2.43.5 From nobody Fri Dec 19 06:32:57 2025 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9A1B52046BA for ; Tue, 13 May 2025 21:46:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.165.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747172799; cv=none; b=e/fZk/Z2dnmVizkkZCUaHzs7lxl7R1em4HqwGpze50BK91yhUraEBYk/VpkFCAezLa4hKxXMfV9waafQgNE+3l4CJZbxPpAlkzhmQHvaq2URsbAgN37G22dLhGrxaxW8ipyPiS4r1z+u3xEdq/qsyJTs8qndLfNCaDhctytIXgE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747172799; 
c=relaxed/simple; bh=7B12a5hYQjTTeBFzhXliq7ihaUcje+OiP8Og9NOsTDI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=dOW0rnn6cXWLeWPjzZGbXbHYw50C/YnrcozADxZ5hOj6FOb0fdfkaA2NQ8AVaNhqUW94jUIiCrj58zTJOlYOoGztLBuFP625My5pDg25Qzt5uKgJoQTkNijw8BlDRQcyRde2Oz4ls95hS9rOC8c5lSNfCkHSP6TsB1I3GIm5Qkw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=r+MTpHuw; arc=none smtp.client-ip=205.220.165.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="r+MTpHuw" Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 54DL0mF9021629; Tue, 13 May 2025 21:46:02 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=IQcSc IcYTX4XkSixb+3nmgGZb89xJcUaF8n6HcenWkA=; b=r+MTpHuwCWfBhZog8yL4j rlatUMnoQ3iAVT4AAS0NZoCzbgUvxilNOYU3x/sEo/8OGmlzchlx3C0OrsfLM6Jh EzD8C5DEj7EYm24244uXr7CKAppGsZNJVhiBF2YH8WaEL1gy3RbHgJhkUM3eCzkW s45YtTekYyRESAgPWDa6QnY4LkgFoqLysWPqrkz/duPt8OXsv1JKZL3+EBegshwW OrGwaJl3IeT56vMBqVsqcr3RpWItyr3k2VoqSRvMxtLOCQ4NBygf6eRr8jGubd3q ozNcQUxDsDYDfOJnsjouQy4iufu3xeSRZGgIT2UKxEFmxIfx4UHm+Z0SKi3SS+s+ A== Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.appoci.oracle.com [138.1.37.129]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 46mbcmgat1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 13 May 2025 21:46:01 +0000 (GMT) Received: from pps.filterd 
(phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 54DLQjeO001890; Tue, 13 May 2025 21:46:00 GMT Received: from psang-work.osdevelopmeniad.oraclevcn.com (psang-work.allregionaliads.osdevelopmeniad.oraclevcn.com [100.100.253.35]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 46mc4yv943-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 13 May 2025 21:46:00 +0000 From: Prakash Sangappa To: linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com Subject: [PATCH V4 3/6] Sched: Tunable to specify duration of time slice extension Date: Tue, 13 May 2025 21:45:51 +0000 Message-ID: <20250513214554.4160454-4-prakash.sangappa@oracle.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250513214554.4160454-1-prakash.sangappa@oracle.com> References: <20250513214554.4160454-1-prakash.sangappa@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.0.736,FMLib:17.12.80.40 definitions=2025-05-13_03,2025-05-09_01,2025-02-21_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 mlxscore=0 bulkscore=0 suspectscore=0 malwarescore=0 spamscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2505070000 definitions=main-2505130206 X-Proofpoint-ORIG-GUID: mXJrKalMY_m117X5kYrTYXNgdqFem5Ol X-Authority-Analysis: v=2.4 cv=f+RIBPyM c=1 sm=1 tr=0 ts=6823bd99 b=1 cx=c_pps a=WeWmnZmh0fydH62SvGsd2A==:117 a=WeWmnZmh0fydH62SvGsd2A==:17 a=dt9VzEwgFbYA:10 a=yPCof4ZbAAAA:8 a=ggT4wvkgqP9MO6PAK6EA:9 X-Proofpoint-Spam-Details-Enc: 
AW1haW4tMjUwNTEzMDIwNSBTYWx0ZWRfXz1PHrWSxq74p 0LFA4kFFbSs46AR/CBEiSH0qnU7OGWDvSFE0a7WdyaeVYXVMq0Nc5R1fXnG0tt8++U2LF9ZRTQO 6ppsYdAdqxNfqbgWQaqdMCkGm6zeu4pgFWRA8rkP4xrA473hDUK1mxZBE4eOLgdMbcibnb3KCRr U4vTiOZGwdZTXZlETDOJzeUoqKzdO5x9qLfD6bKSD0+aKWQHRldaVdxOvb4mv/N4Q3DBmMuXmll 0/emYvmprg4BOl4nxYWLLP7bLBxUDB7VYFAmAzH0qwadVGlvcOxvOBMkN70X9RjWDrcyMcAoBq0 z+SX4yvi8AO1gD5ybhdD4B7d7/WjgIVG8T+39T0w5XqepJ/fGOKjF/sgfU5b4dVta3nEs9SkDvU 0049n0R5yG1jhLOIPHmETL4oGb/Gvr2mfyr4UT5ooqcsPR/H6ElaP6CrO+vazZleV/uKRigG X-Proofpoint-GUID: mXJrKalMY_m117X5kYrTYXNgdqFem5Ol Content-Type: text/plain; charset="utf-8" Add a tunable to specify duration of scheduler time slice extension. The default will be set to 30us and the max value that can be specified is 100us. Setting it to 0, disables scheduler time slice extension. Signed-off-by: Prakash Sangappa --- include/linux/sched.h | 3 +++ include/uapi/linux/rseq.h | 5 +++-- kernel/rseq.c | 7 +++++-- kernel/sched/core.c | 32 ++++++++++++++++++++++++++++++++ 4 files changed, 43 insertions(+), 4 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 71e6c8221c1e..c279232ca6a2 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -407,6 +407,9 @@ static inline void sched_domains_mutex_lock(void) { } static inline void sched_domains_mutex_unlock(void) { } #endif =20 +/* Scheduler time slice extension */ +extern unsigned int sysctl_sched_preempt_delay_us; + struct sched_param { int sched_priority; }; diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h index f4813d931387..015534f064af 100644 --- a/include/uapi/linux/rseq.h +++ b/include/uapi/linux/rseq.h @@ -137,8 +137,9 @@ struct rseq { * this thread. * - RSEQ_CS_FLAG_DELAY_RESCHED * Request by user thread to delay preemption. With use - * of a timer, kernel grants extra cpu time upto 30us for this - * thread before being rescheduled. 
+ * of a timer, kernel grants extra cpu time upto the tunable + * 'sched_preempt_delay_us' value for this thread before it gets + * rescheduled. * - RSEQ_CS_FLAG_RESCHEDULED * Set by kernel if the thread was rescheduled in the extra time * granted due to request RSEQ_CS_DELAY_RESCHED. This bit is diff --git a/kernel/rseq.c b/kernel/rseq.c index 9355654e9b38..44d0f3ae0cd3 100644 --- a/kernel/rseq.c +++ b/kernel/rseq.c @@ -456,6 +456,8 @@ bool rseq_delay_resched(void) if (!IS_ENABLED(CONFIG_SCHED_HRTICK)) return false; =20 + if (!sysctl_sched_preempt_delay_us) + return false; if (!t->rseq) return false; =20 @@ -489,8 +491,9 @@ void rseq_delay_resched_fini(void) * If your critical section is longer than 30 us you get to keep the * pieces. */ - if (t->sched_time_delay) - hrtick_local_start(30 * NSEC_PER_USEC); + if (sysctl_sched_preempt_delay_us && t->sched_time_delay) + hrtick_local_start(sysctl_sched_preempt_delay_us * + NSEC_PER_USEC); #endif } =20 diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 86583fb72914..31928cbcd907 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -148,6 +148,15 @@ __read_mostly int sysctl_resched_latency_warn_once =3D= 1; */ __read_mostly unsigned int sysctl_sched_nr_migrate =3D SCHED_NR_MIGRATE_BR= EAK; =20 +/* + * Scheduler time slice extension, duration in microsecs. + * Max value allowed 100us, default is 30us. + * If set to 0, scheduler time slice extension is disabled. 
+ */
+#define SCHED_PREEMPT_DELAY_DEFAULT_US	30
+__read_mostly unsigned int sysctl_sched_preempt_delay_us =
+	SCHED_PREEMPT_DELAY_DEFAULT_US;
+
 __read_mostly int scheduler_running;
 
 #ifdef CONFIG_SCHED_CORE
@@ -4664,6 +4673,20 @@ static int sysctl_schedstats(const struct ctl_table *table, int write, void *buf
 #endif /* CONFIG_PROC_SYSCTL */
 #endif /* CONFIG_SCHEDSTATS */
 
+static int sysctl_sched_preempt_delay(const struct ctl_table *table, int write,
+				      void *buffer, size_t *lenp, loff_t *ppos)
+{
+	int err;
+
+	err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (err < 0)
+		return err;
+	if (sysctl_sched_preempt_delay_us > SCHED_PREEMPT_DELAY_DEFAULT_US)
+		pr_warn("Sched preemption delay time set higher than default value %d us\n",
+			SCHED_PREEMPT_DELAY_DEFAULT_US);
+	return err;
+}
+
 #ifdef CONFIG_SYSCTL
 static const struct ctl_table sched_core_sysctls[] = {
 #ifdef CONFIG_SCHEDSTATS
@@ -4711,6 +4734,15 @@ static const struct ctl_table sched_core_sysctls[] = {
 		.extra2		= SYSCTL_FOUR,
 	},
 #endif /* CONFIG_NUMA_BALANCING */
+	{
+		.procname	= "sched_preempt_delay_us",
+		.data		= &sysctl_sched_preempt_delay_us,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= sysctl_sched_preempt_delay,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE_HUNDRED,
+	},
 };
 static int __init sched_core_sysctl_init(void)
 {
-- 
2.43.5

From nobody Fri Dec 19 06:32:57 2025
From: Prakash Sangappa
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com
Subject: [PATCH V4 4/6] Sched: Add scheduler stat for cpu time slice extension
Date: Tue, 13 May 2025 21:45:52 +0000
Message-ID: <20250513214554.4160454-5-prakash.sangappa@oracle.com>
In-Reply-To: <20250513214554.4160454-1-prakash.sangappa@oracle.com>
References: <20250513214554.4160454-1-prakash.sangappa@oracle.com>

Add a scheduler stat to record the number of times a thread was granted
a CPU time slice extension.

Signed-off-by: Prakash Sangappa
---
 include/linux/sched.h | 2 ++
 kernel/rseq.c         | 1 +
 kernel/sched/core.c   | 5 +++++
 kernel/sched/debug.c  | 1 +
 4 files changed, 9 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index c279232ca6a2..8cf756e80ae9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -340,6 +340,7 @@ extern void io_schedule_finish(int token);
 extern long io_schedule_timeout(long timeout);
 extern void io_schedule(void);
 extern void hrtick_local_start(u64 delay);
+extern void update_stat_preempt_delayed(struct task_struct *t);
 
 /* wrapper function to trace from this header file */
 DECLARE_TRACEPOINT(sched_set_state_tp);
@@ -563,6 +564,7 @@ struct sched_statistics {
 	u64				nr_wakeups_affine_attempts;
 	u64				nr_wakeups_passive;
 	u64				nr_wakeups_idle;
+	u64				nr_preempt_delay_granted;
 
 #ifdef CONFIG_SCHED_CORE
 	u64				core_forceidle_sum;
diff --git a/kernel/rseq.c b/kernel/rseq.c
index 44d0f3ae0cd3..c4bc52f8ba9c 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -475,6 +475,7 @@ bool rseq_delay_resched(void)
 		return false;
 
 	t->sched_time_delay = 1;
+	update_stat_preempt_delayed(t);
 
 	return true;
 }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 31928cbcd907..880368756b48 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -938,6 +938,11 @@ void hrtick_local_start(u64 delay)
 	rq_unlock(rq, &rf);
 }
 
+void update_stat_preempt_delayed(struct task_struct *t)
+{
+	schedstat_inc(t->stats.nr_preempt_delay_granted);
+}
+
 static void hrtick_rq_init(struct rq *rq)
 {
 #ifdef CONFIG_SMP
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 4cba21f5d24d..6b753f56c312 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -1216,6 +1216,7 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 	P_SCHEDSTAT(nr_wakeups_affine_attempts);
 	P_SCHEDSTAT(nr_wakeups_passive);
 	P_SCHEDSTAT(nr_wakeups_idle);
+	P_SCHEDSTAT(nr_preempt_delay_granted);
 
 	avg_atom = p->se.sum_exec_runtime;
 	if (nr_switches)
-- 
2.43.5

From nobody Fri Dec 19 06:32:57 2025
From: Prakash Sangappa
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com
Subject: [PATCH V4 5/6] Sched: Add tracepoint for sched time slice extension
Date: Tue, 13 May 2025 21:45:53 +0000
Message-ID: <20250513214554.4160454-6-prakash.sangappa@oracle.com>
In-Reply-To: <20250513214554.4160454-1-prakash.sangappa@oracle.com>
References: <20250513214554.4160454-1-prakash.sangappa@oracle.com>

Trace when a thread's preemption is delayed, which can occur if the
running thread requested extra time on the CPU. Also report which
NEED_RESCHED flag, set on the thread, is being cleared.

Suggested-by: Sebastian Andrzej Siewior
Signed-off-by: Prakash Sangappa
---
 include/trace/events/sched.h | 28 ++++++++++++++++++++++++++++
 kernel/entry/common.c        | 12 ++++++++++--
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 8994e97d86c1..4aa04044b14a 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -296,6 +296,34 @@ TRACE_EVENT(sched_migrate_task,
 		  __entry->orig_cpu, __entry->dest_cpu)
 );
 
+/*
+ * Tracepoint for delayed resched requested by task:
+ */
+TRACE_EVENT(sched_delay_resched,
+
+	TP_PROTO(struct task_struct *p, unsigned int resched_flg),
+
+	TP_ARGS(p, resched_flg),
+
+	TP_STRUCT__entry(
+		__array(	char,	comm,	TASK_COMM_LEN	)
+		__field(	pid_t,	pid			)
+		__field(	int,	cpu			)
+		__field(	int,	flg			)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
+		__entry->pid		= p->pid;
+		__entry->cpu		= task_cpu(p);
+		__entry->flg		= resched_flg;
+	),
+
+	TP_printk("comm=%s pid=%d cpu=%d resched_flg_cleared=0x%x",
+		__entry->comm, __entry->pid, __entry->cpu, __entry->flg)
+
+);
+
 DECLARE_EVENT_CLASS(sched_process_template,
 
 	TP_PROTO(struct task_struct *p),
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index b26adccb32df..cd0f076920fd 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -12,6 +12,7 @@
 
 #include "common.h"
 
+#include
 #define CREATE_TRACE_POINTS
 #include
 
@@ -91,6 +92,7 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 						     unsigned long ti_work,
 						     bool irq)
 {
+	unsigned long ti_work_cleared = 0;
 	/*
 	 * Before returning to user space ensure that all pending work
 	 * items have been completed.
@@ -100,10 +102,12 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 		local_irq_enable_exit_to_user(ti_work);
 
 		if (ti_work & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
-			if (irq && rseq_delay_resched())
+			if (irq && rseq_delay_resched()) {
 				clear_tsk_need_resched(current);
-			else
+				ti_work_cleared = ti_work;
+			} else {
 				schedule();
+			}
 		}
 
 		if (ti_work & _TIF_UPROBE)
@@ -134,6 +138,10 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 		ti_work = read_thread_flags();
 	}
 
+	if (ti_work_cleared)
+		trace_sched_delay_resched(current, ti_work_cleared &
+			(_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY));
+
 	/* Return the latest work state for arch_exit_to_user_mode() */
 	return ti_work;
 }
-- 
2.43.5

From nobody Fri Dec 19 06:32:57 2025
From: Prakash Sangappa
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tglx@linutronix.de, bigeasy@linutronix.de, kprateek.nayak@amd.com
Subject: [PATCH V4 6/6] Add API to query supported rseq cs flags
Date: Tue, 13 May 2025 21:45:54 +0000
Message-ID: <20250513214554.4160454-7-prakash.sangappa@oracle.com>
In-Reply-To: <20250513214554.4160454-1-prakash.sangappa@oracle.com>
References: <20250513214554.4160454-1-prakash.sangappa@oracle.com>

Add a new flag, RSEQ_FLAG_QUERY_CS_FLAGS, to the 'flags' argument of
sys_rseq. When this flag is passed, the kernel returns a bit mask of all
the supported rseq cs flags in the 'flags' member of the user-provided
rseq struct.

Suggested-by: Mathieu Desnoyers
Signed-off-by: Prakash Sangappa
---
 include/uapi/linux/rseq.h |  1 +
 kernel/rseq.c             | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 015534f064af..44baea9dd10a 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -20,6 +20,7 @@ enum rseq_cpu_id_state {
 
 enum rseq_flags {
 	RSEQ_FLAG_UNREGISTER = (1 << 0),
+	RSEQ_FLAG_QUERY_CS_FLAGS = (1 << 1),
 };
 
 enum rseq_cs_flags_bit {
diff --git a/kernel/rseq.c b/kernel/rseq.c
index c4bc52f8ba9c..997f7ca722ca 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -576,6 +576,23 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		return 0;
 	}
 
+	/*
+	 * Return the supported rseq_cs flags,
+	 * i.e. the bitwise OR of all the rseq_cs_flags.
+	 */
+	if (flags & RSEQ_FLAG_QUERY_CS_FLAGS) {
+		u32 rseq_csflags = RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT |
+				   RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL |
+				   RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE |
+				   RSEQ_CS_FLAG_DELAY_RESCHED |
+				   RSEQ_CS_FLAG_RESCHEDULED;
+		if (!rseq)
+			return -EINVAL;
+		if (copy_to_user(&rseq->flags, &rseq_csflags, sizeof(u32)))
+			return -EFAULT;
+		return 0;
+	}
+
 	if (unlikely(flags))
 		return -EINVAL;
 
-- 
2.43.5
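
For illustration only, and not part of the series: a minimal userspace sketch of how the query added in 6/6 might be used. It assumes a kernel with this series applied. The macro and function names (QUERY_CS_FLAGS, CS_FLAG_DELAY_RESCHED, CS_FLAG_RESCHEDULED, query_rseq_cs_flags, cs_flag_supported, report_delay_resched_support) are hypothetical, and the bit positions of the two new rseq cs flags are an assumption because their definitions are not shown in this patch.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/rseq.h>

/* Mirrors RSEQ_FLAG_QUERY_CS_FLAGS = (1 << 1) as added by this patch. */
#define QUERY_CS_FLAGS		(1U << 1)

/*
 * ASSUMPTION: bit positions of the two flags added earlier in the series.
 * Their definitions are not part of this patch; take the real values from
 * the patched include/uapi/linux/rseq.h.
 */
#define CS_FLAG_DELAY_RESCHED	(1U << 3)
#define CS_FLAG_RESCHEDULED	(1U << 4)

/* Ask the kernel for the supported rseq cs flags; 0 on success, -1 on error. */
int query_rseq_cs_flags(uint32_t *mask)
{
	struct rseq rs = { 0 };

	/* An unpatched kernel rejects the unknown flag with EINVAL. */
	if (syscall(__NR_rseq, &rs, (uint32_t)sizeof(rs), QUERY_CS_FLAGS, 0U) != 0)
		return -1;
	*mask = rs.flags;
	return 0;
}

/* True if every bit in 'flag' is present in 'mask'. */
int cs_flag_supported(uint32_t mask, uint32_t flag)
{
	return (mask & flag) == flag;
}

/* Print whether the running kernel advertises delayed rescheduling. */
void report_delay_resched_support(void)
{
	uint32_t mask;

	if (query_rseq_cs_flags(&mask) != 0) {
		perror("rseq(RSEQ_FLAG_QUERY_CS_FLAGS)");
		return;
	}
	printf("supported rseq cs flags: 0x%x, delay resched: %s\n", mask,
	       cs_flag_supported(mask, CS_FLAG_DELAY_RESCHED) ? "yes" : "no");
}
```

Calling report_delay_resched_support() early in a program would let it fall back to ordinary locking when the kernel does not report the delay flag. Since the query path in the patch only writes the 'flags' member, the sketch passes a scratch struct rseq rather than the thread's registered area.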