From nobody Thu Apr 2 20:28:06 2026
From: Dmitry Ilvokhin
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long,
 Thomas Bogendoerfer, Juergen Gross, Ajay Kaher, Alexey Makhalov,
 Broadcom internal kernel review list, Thomas Gleixner, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin", Arnd Bergmann, Dennis Zhou,
 Tejun Heo, Christoph Lameter, Steven Rostedt, Masami Hiramatsu,
 Mathieu Desnoyers
Cc: linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org,
 virtualization@lists.linux.dev, linux-arch@vger.kernel.org,
 linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
 kernel-team@meta.com, Dmitry Ilvokhin
Subject: [PATCH v4 1/5] tracing/lock: Remove unnecessary linux/sched.h include
Date: Thu, 26 Mar 2026 15:10:00 +0000
Message-ID: <5593ed9718b1a6e4ec51d99772c485734029d4d4.1774536681.git.d@ilvokhin.com>
X-Mailer: git-send-email 2.53.0

None of the trace events in lock.h reference anything from linux/sched.h.
Remove the unnecessary include.

Signed-off-by: Dmitry Ilvokhin
Acked-by: Usama Arif
---
 include/trace/events/lock.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
index 8e89baa3775f..da978f2afb45 100644
--- a/include/trace/events/lock.h
+++ b/include/trace/events/lock.h
@@ -5,7 +5,6 @@
 #if !defined(_TRACE_LOCK_H) || defined(TRACE_HEADER_MULTI_READ)
 #define _TRACE_LOCK_H
 
-#include <linux/sched.h>
 #include
 
 /* flags for lock:contention_begin */
-- 
2.52.0

From nobody Thu Apr 2 20:28:06 2026
From: Dmitry Ilvokhin
Cc: linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org,
 virtualization@lists.linux.dev, linux-arch@vger.kernel.org,
 linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
 kernel-team@meta.com, Dmitry Ilvokhin, Usama Arif
Subject: [PATCH v4 2/5] locking/percpu-rwsem: Extract __percpu_up_read()
Date: Thu, 26 Mar 2026 15:10:01 +0000
Message-ID: <223dd069af9f3395d0044398e7996a98a8c94e5a.1774536681.git.d@ilvokhin.com>

Move the percpu_up_read() slowpath out of the inline function into a new
__percpu_up_read() to avoid binary size increase from adding a tracepoint
to an inlined function.

Signed-off-by: Dmitry Ilvokhin
Acked-by: Usama Arif
---
 include/linux/percpu-rwsem.h  | 15 +++------------
 kernel/locking/percpu-rwsem.c | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index c8cb010d655e..39d5bf8e6562 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -107,6 +107,8 @@ static inline bool percpu_down_read_trylock(struct percpu_rw_semaphore *sem)
 	return ret;
 }
 
+extern void __percpu_up_read(struct percpu_rw_semaphore *sem);
+
 static inline void percpu_up_read(struct percpu_rw_semaphore *sem)
 {
 	rwsem_release(&sem->dep_map, _RET_IP_);
@@ -118,18 +120,7 @@ static inline void percpu_up_read(struct percpu_rw_semaphore *sem)
 	if (likely(rcu_sync_is_idle(&sem->rss))) {
 		this_cpu_dec(*sem->read_count);
 	} else {
-		/*
-		 * slowpath; reader will only ever wake a single blocked
-		 * writer.
-		 */
-		smp_mb(); /* B matches C */
-		/*
-		 * In other words, if they see our decrement (presumably to
-		 * aggregate zero, as that is the only time it matters) they
-		 * will also see our critical section.
-		 */
-		this_cpu_dec(*sem->read_count);
-		rcuwait_wake_up(&sem->writer);
+		__percpu_up_read(sem);
 	}
 	preempt_enable();
 }
diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index ef234469baac..f3ee7a0d6047 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -288,3 +288,21 @@ void percpu_up_write(struct percpu_rw_semaphore *sem)
 	rcu_sync_exit(&sem->rss);
 }
 EXPORT_SYMBOL_GPL(percpu_up_write);
+
+void __percpu_up_read(struct percpu_rw_semaphore *sem)
+{
+	lockdep_assert_preemption_disabled();
+	/*
+	 * slowpath; reader will only ever wake a single blocked
+	 * writer.
+	 */
+	smp_mb(); /* B matches C */
+	/*
+	 * In other words, if they see our decrement (presumably to
+	 * aggregate zero, as that is the only time it matters) they
+	 * will also see our critical section.
+	 */
+	this_cpu_dec(*sem->read_count);
+	rcuwait_wake_up(&sem->writer);
+}
+EXPORT_SYMBOL_GPL(__percpu_up_read);
-- 
2.52.0

From nobody Thu Apr 2 20:28:06 2026
From: Dmitry Ilvokhin
Subject: [PATCH v4 3/5] locking: Add contended_release tracepoint to sleepable locks
Date: Thu, 26 Mar 2026 15:10:02 +0000

Add the contended_release trace event. This tracepoint fires on the
holder side when a contended lock is released, complementing the
existing contention_begin/contention_end tracepoints which fire on the
waiter side.

This enables correlating lock hold time under contention with waiter
events by lock address.

Add trace_contended_release() calls to the slowpath unlock paths of
sleepable locks: mutex, rtmutex, semaphore, rwsem, percpu-rwsem, and
RT-specific rwbase locks.
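
As a rough illustration of the correlation by lock address mentioned
above, a trace consumer could pair waiter-side and holder-side events
roughly as follows. This is a userspace sketch only: the record layout
(struct demo_event) and the helper demo_contended_hold_ns() are
hypothetical and are not part of this series or of any tracing API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified event record; real trace records differ. */
enum demo_event_type { DEMO_CONTENTION_BEGIN, DEMO_CONTENDED_RELEASE };

struct demo_event {
	uint64_t ts_ns;			/* event timestamp */
	enum demo_event_type type;	/* which tracepoint fired */
	uintptr_t lock_addr;		/* lock address reported by the event */
};

/*
 * Return the time from the first contention_begin on @lock_addr to the
 * next contended_release on the same address, or 0 if no such pair is
 * found. Events are assumed to be sorted by timestamp.
 */
static uint64_t demo_contended_hold_ns(const struct demo_event *ev, size_t n,
				       uintptr_t lock_addr)
{
	uint64_t begin = 0;
	int have_begin = 0;

	for (size_t i = 0; i < n; i++) {
		if (ev[i].lock_addr != lock_addr)
			continue;
		if (ev[i].type == DEMO_CONTENTION_BEGIN && !have_begin) {
			begin = ev[i].ts_ns;
			have_begin = 1;
		} else if (ev[i].type == DEMO_CONTENDED_RELEASE && have_begin) {
			return ev[i].ts_ns - begin;
		}
	}
	return 0;
}
```

The result approximates how long the holder kept the lock after the
first waiter showed up, which is the quantity the waiter-side events
alone cannot provide.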

Where possible, trace_contended_release() fires before the lock is
released and before the waiter is woken. For some lock types, the
tracepoint fires after the release but before the wake. Making the
placement consistent across all lock types is not worth the added
complexity.

For reader/writer locks, the tracepoint fires for every reader releasing
while a writer is waiting, not only for the last reader.

Signed-off-by: Dmitry Ilvokhin
---
 include/trace/events/lock.h   | 17 +++++++++++++++++
 kernel/locking/mutex.c        |  4 ++++
 kernel/locking/percpu-rwsem.c | 11 +++++++++++
 kernel/locking/rtmutex.c      |  1 +
 kernel/locking/rwbase_rt.c    |  6 ++++++
 kernel/locking/rwsem.c        | 10 ++++++++--
 kernel/locking/semaphore.c    |  4 ++++
 7 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
index da978f2afb45..1ded869cd619 100644
--- a/include/trace/events/lock.h
+++ b/include/trace/events/lock.h
@@ -137,6 +137,23 @@ TRACE_EVENT(contention_end,
 	TP_printk("%p (ret=%d)", __entry->lock_addr, __entry->ret)
 );
 
+TRACE_EVENT(contended_release,
+
+	TP_PROTO(void *lock),
+
+	TP_ARGS(lock),
+
+	TP_STRUCT__entry(
+		__field(void *, lock_addr)
+	),
+
+	TP_fast_assign(
+		__entry->lock_addr = lock;
+	),
+
+	TP_printk("%p", __entry->lock_addr)
+);
+
 #endif /* _TRACE_LOCK_H */
 
 /* This part must be outside protection */
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 427187ff02db..6c2c9312eb8f 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -997,6 +997,9 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 		wake_q_add(&wake_q, next);
 	}
 
+	if (trace_contended_release_enabled() && waiter)
+		trace_contended_release(lock);
+
 	if (owner & MUTEX_FLAG_HANDOFF)
 		__mutex_handoff(lock, next);
 
@@ -1194,6 +1197,7 @@ EXPORT_SYMBOL(ww_mutex_lock_interruptible);
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(contention_begin);
 EXPORT_TRACEPOINT_SYMBOL_GPL(contention_end);
+EXPORT_TRACEPOINT_SYMBOL_GPL(contended_release);
 
 /**
  * atomic_dec_and_mutex_lock - return holding mutex if we dec to 0
diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index f3ee7a0d6047..46b5903989b8 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -263,6 +263,9 @@ void percpu_up_write(struct percpu_rw_semaphore *sem)
 {
 	rwsem_release(&sem->dep_map, _RET_IP_);
 
+	if (trace_contended_release_enabled() && wq_has_sleeper(&sem->waiters))
+		trace_contended_release(sem);
+
 	/*
 	 * Signal the writer is done, no fast path yet.
 	 *
@@ -292,6 +295,14 @@ EXPORT_SYMBOL_GPL(percpu_up_write);
 void __percpu_up_read(struct percpu_rw_semaphore *sem)
 {
 	lockdep_assert_preemption_disabled();
+	/*
+	 * After percpu_up_write() completes, rcu_sync_is_idle() can still
+	 * return false during the grace period, forcing readers into this
+	 * slowpath. Only trace when a writer is actually waiting for
+	 * readers to drain.
+	 */
+	if (trace_contended_release_enabled() && rcuwait_active(&sem->writer))
+		trace_contended_release(sem);
 	/*
 	 * slowpath; reader will only ever wake a single blocked
 	 * writer.
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index ccaba6148b61..3db8a840b4e8 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1466,6 +1466,7 @@ static void __sched rt_mutex_slowunlock(struct rt_mutex_base *lock)
 		raw_spin_lock_irqsave(&lock->wait_lock, flags);
 	}
 
+	trace_contended_release(lock);
 	/*
 	 * The wakeup next waiter path does not suffer from the above
 	 * race. See the comments there.
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index 82e078c0665a..74da5601018f 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -174,6 +174,8 @@ static void __sched __rwbase_read_unlock(struct rwbase_rt *rwb,
 static __always_inline void rwbase_read_unlock(struct rwbase_rt *rwb,
 					       unsigned int state)
 {
+	if (trace_contended_release_enabled() && rt_mutex_owner(&rwb->rtmutex))
+		trace_contended_release(rwb);
 	/*
 	 * rwb->readers can only hit 0 when a writer is waiting for the
 	 * active readers to leave the critical section.
@@ -205,6 +207,8 @@ static inline void rwbase_write_unlock(struct rwbase_rt *rwb)
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&rtm->wait_lock, flags);
+	if (trace_contended_release_enabled() && rt_mutex_has_waiters(rtm))
+		trace_contended_release(rwb);
 	__rwbase_write_unlock(rwb, WRITER_BIAS, flags);
 }
 
@@ -214,6 +218,8 @@ static inline void rwbase_write_downgrade(struct rwbase_rt *rwb)
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&rtm->wait_lock, flags);
+	if (trace_contended_release_enabled() && rt_mutex_has_waiters(rtm))
+		trace_contended_release(rwb);
 	/* Release it and account current as reader */
 	__rwbase_write_unlock(rwb, WRITER_BIAS - 1, flags);
 }
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index bf647097369c..602d5fd3c91a 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -1387,6 +1387,8 @@ static inline void __up_read(struct rw_semaphore *sem)
 	rwsem_clear_reader_owned(sem);
 	tmp = atomic_long_add_return_release(-RWSEM_READER_BIAS, &sem->count);
 	DEBUG_RWSEMS_WARN_ON(tmp < 0, sem);
+	if (trace_contended_release_enabled() && (tmp & RWSEM_FLAG_WAITERS))
+		trace_contended_release(sem);
 	if (unlikely((tmp & (RWSEM_LOCK_MASK|RWSEM_FLAG_WAITERS)) ==
 		      RWSEM_FLAG_WAITERS)) {
 		clear_nonspinnable(sem);
@@ -1413,8 +1415,10 @@ static inline void __up_write(struct rw_semaphore *sem)
 	preempt_disable();
 	rwsem_clear_owner(sem);
 	tmp = atomic_long_fetch_add_release(-RWSEM_WRITER_LOCKED, &sem->count);
-	if (unlikely(tmp & RWSEM_FLAG_WAITERS))
+	if (unlikely(tmp & RWSEM_FLAG_WAITERS)) {
+		trace_contended_release(sem);
 		rwsem_wake(sem);
+	}
 	preempt_enable();
 }
 
@@ -1437,8 +1441,10 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 	tmp = atomic_long_fetch_add_release(
 		-RWSEM_WRITER_LOCKED+RWSEM_READER_BIAS, &sem->count);
 	rwsem_set_reader_owned(sem);
-	if (tmp & RWSEM_FLAG_WAITERS)
+	if (tmp & RWSEM_FLAG_WAITERS) {
+		trace_contended_release(sem);
 		rwsem_downgrade_wake(sem);
+	}
 	preempt_enable();
 }
 
diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index 74d41433ba13..35ac3498dca5 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -230,6 +230,10 @@ void __sched up(struct semaphore *sem)
 		sem->count++;
 	else
 		__up(sem, &wake_q);
+
+	if (trace_contended_release_enabled() && !wake_q_empty(&wake_q))
+		trace_contended_release(sem);
+
 	raw_spin_unlock_irqrestore(&sem->lock, flags);
 	if (!wake_q_empty(&wake_q))
 		wake_up_q(&wake_q);
-- 
2.52.0

From nobody Thu Apr 2 20:28:06 2026
From: Dmitry Ilvokhin
Subject: [PATCH v4 4/5] locking: Factor out queued_spin_release()
Date: Thu, 26 Mar 2026 15:10:03 +0000

Introduce queued_spin_release() as an arch-overridable unlock primitive,
and make queued_spin_unlock() a generic wrapper around it. This is a
preparatory refactoring for the next commit, which adds
contended_release tracepoint instrumentation to queued_spin_unlock().

Rename the existing arch-specific queued_spin_unlock() overrides on
x86 (paravirt) and MIPS to queued_spin_release().

No functional change.
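
The override mechanism described above can be sketched in plain
userspace C. This only illustrates the "#define name name" plus
"#ifndef name" idiom and is not kernel code: struct demo_lock,
demo_release(), demo_unlock(), and the DEMO_ARCH_OVERRIDE switch are
hypothetical names chosen for the example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct qspinlock. */
struct demo_lock {
	bool locked;
	int arch_hook_calls;	/* counts calls to the "arch" variant */
};

#ifdef DEMO_ARCH_OVERRIDE
/*
 * "Arch" header: defining the macro to its own name tells the generic
 * header below that an override already exists.
 */
#define demo_release demo_release
static inline void demo_release(struct demo_lock *lock)
{
	lock->arch_hook_calls++;
	lock->locked = false;
}
#endif

/* "Generic" header: provide a default only if nothing overrode it. */
#ifndef demo_release
static inline void demo_release(struct demo_lock *lock)
{
	lock->locked = false;
}
#endif

/*
 * Generic wrapper: defined exactly once regardless of which release
 * variant was selected, so common instrumentation can be added here.
 */
static inline void demo_unlock(struct demo_lock *lock)
{
	demo_release(lock);
}
```

Building the sketch with -DDEMO_ARCH_OVERRIDE selects the "arch"
variant; without it the generic default is used. Callers of
demo_unlock() are unaffected either way.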

Signed-off-by: Dmitry Ilvokhin
---
 arch/mips/include/asm/spinlock.h         |  6 +++---
 arch/x86/include/asm/paravirt-spinlock.h |  6 +++---
 include/asm-generic/qspinlock.h          | 15 ++++++++++++---
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/arch/mips/include/asm/spinlock.h b/arch/mips/include/asm/spinlock.h
index 6ce2117e49f6..c349162f15eb 100644
--- a/arch/mips/include/asm/spinlock.h
+++ b/arch/mips/include/asm/spinlock.h
@@ -13,12 +13,12 @@
 
 #include
 
-#define queued_spin_unlock queued_spin_unlock
+#define queued_spin_release queued_spin_release
 /**
- * queued_spin_unlock - release a queued spinlock
+ * queued_spin_release - release a queued spinlock
  * @lock : Pointer to queued spinlock structure
  */
-static inline void queued_spin_unlock(struct qspinlock *lock)
+static inline void queued_spin_release(struct qspinlock *lock)
 {
 	/* This could be optimised with ARCH_HAS_MMIOWB */
 	mmiowb();
diff --git a/arch/x86/include/asm/paravirt-spinlock.h b/arch/x86/include/asm/paravirt-spinlock.h
index 7beffcb08ed6..ac75e0736198 100644
--- a/arch/x86/include/asm/paravirt-spinlock.h
+++ b/arch/x86/include/asm/paravirt-spinlock.h
@@ -49,9 +49,9 @@ static __always_inline bool pv_vcpu_is_preempted(long cpu)
 				ALT_NOT(X86_FEATURE_VCPUPREEMPT));
 }
 
-#define queued_spin_unlock queued_spin_unlock
+#define queued_spin_release queued_spin_release
 /**
- * queued_spin_unlock - release a queued spinlock
+ * queued_spin_release - release a queued spinlock
  * @lock : Pointer to queued spinlock structure
  *
  * A smp_store_release() on the least-significant byte.
@@ -66,7 +66,7 @@ static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	pv_queued_spin_lock_slowpath(lock, val);
 }
 
-static inline void queued_spin_unlock(struct qspinlock *lock)
+static inline void queued_spin_release(struct qspinlock *lock)
 {
 	kcsan_release();
 	pv_queued_spin_unlock(lock);
diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index bf47cca2c375..df76f34645a0 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -115,12 +115,12 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock)
 }
 #endif
 
-#ifndef queued_spin_unlock
+#ifndef queued_spin_release
 /**
- * queued_spin_unlock - release a queued spinlock
+ * queued_spin_release - release a queued spinlock
  * @lock : Pointer to queued spinlock structure
  */
-static __always_inline void queued_spin_unlock(struct qspinlock *lock)
+static __always_inline void queued_spin_release(struct qspinlock *lock)
 {
 	/*
 	 * unlock() needs release semantics:
@@ -129,6 +129,15 @@ static __always_inline void queued_spin_unlock(struct qspinlock *lock)
 }
 #endif
 
+/**
+ * queued_spin_unlock - unlock a queued spinlock
+ * @lock : Pointer to queued spinlock structure
+ */
+static __always_inline void queued_spin_unlock(struct qspinlock *lock)
+{
+	queued_spin_release(lock);
+}
+
 #ifndef virt_spin_lock
 static __always_inline bool virt_spin_lock(struct qspinlock *lock)
 {
-- 
2.52.0

From nobody Thu Apr 2 20:28:06 2026
From: Dmitry Ilvokhin
Subject: [PATCH v4 5/5] locking: Add contended_release tracepoint to spinning locks
Date: Thu, 26 Mar 2026 15:10:04 +0000
Message-ID: <81eb8e0cd90b31e761e12721dbacb967281f840f.1774536681.git.d@ilvokhin.com>

Extend the contended_release tracepoint to queued spinlocks and queued
rwlocks.

Use the arch-overridable queued_spin_release(), introduced in the
previous commit, to ensure the tracepoint works correctly across all
architectures, including those with custom unlock implementations
(e.g. x86 paravirt).

When the tracepoint is disabled, the only addition to the hot path is a
single NOP instruction (the static branch). When enabled, the contention
check, trace call, and unlock are combined in an out-of-line function to
minimize hot path impact, avoiding the compiler needing to preserve the
lock pointer in a callee-saved register across the trace call.

Binary size impact (x86_64, defconfig):

  uninlined unlock (common case): +983 bytes (+0.00%)
  inlined unlock (worst case):    +58165 bytes (+0.24%)

The inlined unlock case could not be achieved through Kconfig options on
x86_64 as PREEMPT_BUILD unconditionally selects UNINLINE_SPIN_UNLOCK on
x86_64. The UNINLINE_SPIN_UNLOCK guards were manually inverted to force
inline the unlock path and estimate the worst case binary size increase.
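
The disabled/enabled split described in this message can be modeled in
userspace C as follows. This is a sketch under stated assumptions, not
the kernel implementation: a plain bool (demo_trace_enabled) stands in
for the static branch, a counter (demo_trace_events) stands in for the
trace call, and struct demo_lock with its has_waiters flag is a
hypothetical stand-in for the lock and its contention check. The
noinline attribute is a GCC/Clang extension.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the lock and its "is contended" state. */
struct demo_lock {
	bool locked;
	bool has_waiters;
};

static bool demo_trace_enabled;	/* models tracepoint_enabled() */
static int demo_trace_events;	/* counts emitted "contended_release" events */

/* Fast-path primitive: only the release itself. */
static inline void demo_release(struct demo_lock *lock)
{
	lock->locked = false;
}

/*
 * Slow variant, kept out of line: contention check, trace, and release
 * happen together, so the inline caller has nothing left to do after
 * the call returns.
 */
__attribute__((noinline))
static void demo_release_traced(struct demo_lock *lock)
{
	if (lock->has_waiters)
		demo_trace_events++;
	demo_release(lock);
}

/* Inline wrapper: one cheap check, then either path. */
static inline void demo_unlock(struct demo_lock *lock)
{
	if (demo_trace_enabled) {
		demo_release_traced(lock);
		return;
	}
	demo_release(lock);
}
```

With tracing off, demo_unlock() reduces to the flag check plus the
plain release. With tracing on, an event is counted only when the lock
reports waiters, mirroring the "contended" condition.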

Architectures with fully custom qspinlock implementations (e.g.
PowerPC) are not covered by this change.

Signed-off-by: Dmitry Ilvokhin
---
 include/asm-generic/qrwlock.h   | 48 +++++++++++++++++++++++++++------
 include/asm-generic/qspinlock.h | 18 +++++++++++++
 kernel/locking/qrwlock.c        | 16 +++++++++++
 kernel/locking/qspinlock.c      |  8 ++++++
 4 files changed, 82 insertions(+), 8 deletions(-)

diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 75b8f4601b28..e24dc537fd66 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -14,6 +14,7 @@
 #define __ASM_GENERIC_QRWLOCK_H
 
 #include
+#include
 #include
 #include
 
@@ -35,6 +36,10 @@
  */
 extern void queued_read_lock_slowpath(struct qrwlock *lock);
 extern void queued_write_lock_slowpath(struct qrwlock *lock);
+extern void queued_read_unlock_traced(struct qrwlock *lock);
+extern void queued_write_unlock_traced(struct qrwlock *lock);
+
+DECLARE_TRACEPOINT(contended_release);
 
 /**
  * queued_read_trylock - try to acquire read lock of a queued rwlock
@@ -102,10 +107,16 @@ static inline void queued_write_lock(struct qrwlock *lock)
 }
 
 /**
- * queued_read_unlock - release read lock of a queued rwlock
+ * queued_rwlock_is_contended - check if the lock is contended
  * @lock : Pointer to queued rwlock structure
+ * Return: 1 if lock contended, 0 otherwise
  */
-static inline void queued_read_unlock(struct qrwlock *lock)
+static inline int queued_rwlock_is_contended(struct qrwlock *lock)
+{
+	return arch_spin_is_locked(&lock->wait_lock);
+}
+
+static __always_inline void __queued_read_unlock(struct qrwlock *lock)
 {
 	/*
 	 * Atomically decrement the reader count
@@ -114,22 +125,43 @@ static inline void queued_read_unlock(struct qrwlock *lock)
 }
 
 /**
- * queued_write_unlock - release write lock of a queued rwlock
+ * queued_read_unlock - release read lock of a queued rwlock
  * @lock : Pointer to queued rwlock structure
  */
-static inline void queued_write_unlock(struct qrwlock *lock)
+static inline void queued_read_unlock(struct qrwlock *lock)
+{
+	/*
+	 * Trace and unlock are combined in the traced unlock variant so
+	 * the compiler does not need to preserve the lock pointer across
+	 * the function call, avoiding callee-saved register save/restore
+	 * on the hot path.
+	 */
+	if (tracepoint_enabled(contended_release)) {
+		queued_read_unlock_traced(lock);
+		return;
+	}
+
+	__queued_read_unlock(lock);
+}
+
+static __always_inline void __queued_write_unlock(struct qrwlock *lock)
 {
 	smp_store_release(&lock->wlocked, 0);
 }
 
 /**
- * queued_rwlock_is_contended - check if the lock is contended
+ * queued_write_unlock - release write lock of a queued rwlock
  * @lock : Pointer to queued rwlock structure
- * Return: 1 if lock contended, 0 otherwise
  */
-static inline int queued_rwlock_is_contended(struct qrwlock *lock)
+static inline void queued_write_unlock(struct qrwlock *lock)
 {
-	return arch_spin_is_locked(&lock->wait_lock);
+	/* See comment in queued_read_unlock(). */
+	if (tracepoint_enabled(contended_release)) {
+		queued_write_unlock_traced(lock);
+		return;
+	}
+
+	__queued_write_unlock(lock);
 }
 
 /*
diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index df76f34645a0..915a4c2777f6 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -41,6 +41,7 @@
 
 #include
 #include
+#include
 
 #ifndef queued_spin_is_locked
 /**
@@ -129,12 +130,29 @@ static __always_inline void queued_spin_release(struct qspinlock *lock)
 }
 #endif
 
+DECLARE_TRACEPOINT(contended_release);
+
+extern void queued_spin_release_traced(struct qspinlock *lock);
+
 /**
  * queued_spin_unlock - unlock a queued spinlock
  * @lock : Pointer to queued spinlock structure
+ *
+ * Generic tracing wrapper around the arch-overridable
+ * queued_spin_release().
  */
 static __always_inline void queued_spin_unlock(struct qspinlock *lock)
 {
+	/*
+	 * Trace and release are combined in queued_spin_release_traced() so
+	 * the compiler does not need to preserve the lock pointer across the
+	 * function call, avoiding callee-saved register save/restore on the
+	 * hot path.
+	 */
+	if (tracepoint_enabled(contended_release)) {
+		queued_spin_release_traced(lock);
+		return;
+	}
 	queued_spin_release(lock);
 }
 
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index d2ef312a8611..5f7a0fc2b27a 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -90,3 +90,19 @@ void __lockfunc queued_write_lock_slowpath(struct qrwlock *lock)
 	trace_contention_end(lock, 0);
 }
 EXPORT_SYMBOL(queued_write_lock_slowpath);
+
+void __lockfunc queued_read_unlock_traced(struct qrwlock *lock)
+{
+	if (queued_rwlock_is_contended(lock))
+		trace_contended_release(lock);
+	__queued_read_unlock(lock);
+}
+EXPORT_SYMBOL(queued_read_unlock_traced);
+
+void __lockfunc queued_write_unlock_traced(struct qrwlock *lock)
+{
+	if (queued_rwlock_is_contended(lock))
+		trace_contended_release(lock);
+	__queued_write_unlock(lock);
+}
+EXPORT_SYMBOL(queued_write_unlock_traced);
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index af8d122bb649..c72610980ec7 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -104,6 +104,14 @@ static __always_inline u32 __pv_wait_head_or_lock(struct qspinlock *lock,
 #define queued_spin_lock_slowpath native_queued_spin_lock_slowpath
 #endif
 
+void __lockfunc queued_spin_release_traced(struct qspinlock *lock)
+{
+	if (queued_spin_is_contended(lock))
+		trace_contended_release(lock);
+	queued_spin_release(lock);
+}
+EXPORT_SYMBOL(queued_spin_release_traced);
+
 #endif /* _GEN_PV_LOCK_SLOWPATH */
 
 /**
-- 
2.52.0