Message-ID: <20250726141224.627257236@kernel.org>
User-Agent: quilt/0.68
Date: Sat, 26 Jul 2025 10:07:09 -0400
From: Steven Rostedt
To: linux-kernel@vger.kernel.org
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
 Peter Zijlstra, Linus Torvalds, Ingo Molnar, Josh Poimboeuf, Jiri Olsa,
 Arnaldo Carvalho de Melo, Namhyung Kim, Thomas Gleixner, Andrii Nakryiko,
 Indu Bhagat, "Jose E. Marchesi", Beau Belgrave, Jens Remus, Jens Axboe,
 Florian Weimer, Sam James
Subject: [for-next][PATCH 05/10] unwind_user/deferred: Make unwind deferral requests NMI-safe
References: <20250726140704.560579628@kernel.org>

From: Steven Rostedt

Make unwind_deferred_request() NMI-safe so tracers in NMI context can
call it and safely request a user space stacktrace when the task exits.

Note, this is only allowed for architectures that implement a safe
cmpxchg. If a deferred stack trace is requested from NMI context on an
architecture that does not support a safe NMI cmpxchg, it will return
-EINVAL and trigger a warning. Those architectures would need another
method (perhaps an irqwork) to request a deferred user space stack
trace. That can be dealt with later if one of these architectures
requires this feature.

Cc: Masami Hiramatsu
Cc: Mathieu Desnoyers
Cc: Josh Poimboeuf
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Arnaldo Carvalho de Melo
Cc: Namhyung Kim
Cc: Thomas Gleixner
Cc: Andrii Nakryiko
Cc: Indu Bhagat
Cc: "Jose E. Marchesi"
Cc: Beau Belgrave
Cc: Jens Remus
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Jens Axboe
Cc: Florian Weimer
Cc: Sam James
Link: https://lore.kernel.org/20250725185739.906767342@kernel.org
Suggested-by: Peter Zijlstra
Signed-off-by: Steven Rostedt (Google)
---
 kernel/unwind/deferred.c | 52 +++++++++++++++++++++++++++++++++-------
 1 file changed, 44 insertions(+), 8 deletions(-)

diff --git a/kernel/unwind/deferred.c b/kernel/unwind/deferred.c
index 2cbae2ada309..c5ac087d2396 100644
--- a/kernel/unwind/deferred.c
+++ b/kernel/unwind/deferred.c
@@ -12,6 +12,31 @@
 #include
 #include
 
+/*
+ * For requesting a deferred user space stack trace from NMI context
+ * the architecture must support a safe cmpxchg in NMI context.
+ * For those architectures that do not have that, then it cannot ask
+ * for a deferred user space stack trace from an NMI context. If it
+ * does, then it will get -EINVAL.
+ */
+#if defined(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG)
+# define CAN_USE_IN_NMI		1
+static inline bool try_assign_cnt(struct unwind_task_info *info, u32 cnt)
+{
+	u32 old = 0;
+
+	return try_cmpxchg(&info->id.cnt, &old, cnt);
+}
+#else
+# define CAN_USE_IN_NMI		0
+/* When NMIs are not allowed, this always succeeds */
+static inline bool try_assign_cnt(struct unwind_task_info *info, u32 cnt)
+{
+	info->id.cnt = cnt;
+	return true;
+}
+#endif
+
 /* Make the cache fit in a 4K page */
 #define UNWIND_MAX_ENTRIES		\
 	((SZ_4K - sizeof(struct unwind_cache)) / sizeof(long))
@@ -42,14 +67,13 @@ static DEFINE_PER_CPU(u32, unwind_ctx_ctr);
 static u64 get_cookie(struct unwind_task_info *info)
 {
 	u32 cnt = 1;
-	u32 old = 0;
 
 	if (info->id.cpu)
 		return info->id.id;
 
 	/* LSB is always set to ensure 0 is an invalid value */
 	cnt |= __this_cpu_read(unwind_ctx_ctr) + 2;
-	if (try_cmpxchg(&info->id.cnt, &old, cnt)) {
+	if (try_assign_cnt(info, cnt)) {
 		/* Update the per cpu counter */
 		__this_cpu_write(unwind_ctx_ctr, cnt);
 	}
@@ -167,31 +191,43 @@ static void unwind_deferred_task_work(struct callback_head *head)
 int unwind_deferred_request(struct unwind_work *work, u64 *cookie)
 {
 	struct unwind_task_info *info = &current->unwind_info;
+	long pending;
 	int ret;
 
 	*cookie = 0;
 
-	if (WARN_ON_ONCE(in_nmi()))
-		return -EINVAL;
-
 	if ((current->flags & (PF_KTHREAD | PF_EXITING)) ||
 	    !user_mode(task_pt_regs(current)))
 		return -EINVAL;
 
+	/*
+	 * NMI requires having safe cmpxchg operations.
+	 * Trigger a warning to make it obvious that an architecture
+	 * is using this in NMI when it should not be.
+	 */
+	if (WARN_ON_ONCE(!CAN_USE_IN_NMI && in_nmi()))
+		return -EINVAL;
+
 	guard(irqsave)();
 
 	*cookie = get_cookie(info);
 
 	/* callback already pending? */
-	if (info->pending)
+	pending = READ_ONCE(info->pending);
+	if (pending)
+		return 1;
+
+	/* Claim the work unless an NMI just now swooped in to do so. */
+	if (!try_cmpxchg(&info->pending, &pending, 1))
 		return 1;
 
 	/* The work has been claimed, now schedule it. */
 	ret = task_work_add(current, &info->work, TWA_RESUME);
-	if (WARN_ON_ONCE(ret))
+	if (WARN_ON_ONCE(ret)) {
+		WRITE_ONCE(info->pending, 0);
 		return ret;
+	}
 
-	info->pending = 1;
 	return 0;
 }
 
-- 
2.47.2
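The core of the patch is the read-then-claim pattern on info->pending:
read the flag first, and only if it is clear, claim it with a
compare-and-exchange that fails if an NMI raced in between the read and
the claim. The same idea can be sketched in userspace with C11 atomics.
This is only an analogy, not kernel code: struct task_info,
try_claim_pending(), and release_pending() are hypothetical names
standing in for unwind_task_info, the claim logic in
unwind_deferred_request(), and the task_work_add() error path.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's unwind_task_info::pending. */
struct task_info {
	atomic_long pending;
};

/*
 * Mirrors the patch's logic: if pending is already set, another
 * context owns the work.  Otherwise attempt to claim it with a
 * compare-and-exchange, which fails if a racing (NMI-like) context
 * set the flag between the load and the exchange.
 *
 * Returns 1 if the work was already pending (or was just claimed by
 * a racing context), 0 if this caller claimed it and must schedule
 * the callback.
 */
static int try_claim_pending(struct task_info *info)
{
	long pending = atomic_load(&info->pending);

	if (pending)
		return 1;

	if (!atomic_compare_exchange_strong(&info->pending, &pending, 1))
		return 1;	/* lost the race: someone else claimed it */

	return 0;		/* claimed: caller schedules the work */
}

/* Undo the claim, as the patch does when task_work_add() fails. */
static void release_pending(struct task_info *info)
{
	atomic_store(&info->pending, 0);
}
```

The strong (rather than weak) compare-exchange is deliberate: a
spurious failure here would wrongly report the work as already
pending, so the operation must fail only on a genuine race.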