From nobody Sat Apr 20 03:37:57 2024
From: ira.weiny@intel.com
To: Rik van Riel, Borislav Petkov, Dave Hansen
Cc: Ira Weiny, Dave Hansen, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [RFC PATCH 1/5] entry: Pass pt_regs to irqentry_exit_cond_resched()
Date: Fri, 5 Aug 2022 10:30:05 -0700
Message-Id: <20220805173009.3128098-2-ira.weiny@intel.com>
In-Reply-To: <20220805173009.3128098-1-ira.weiny@intel.com>
References: <20220805173009.3128098-1-ira.weiny@intel.com>

From: Ira Weiny

Auxiliary pt_regs space needs to be manipulated by the generic
entry/exit code.

Ideally irqentry_exit() would take care of handling any auxiliary
pt_regs on exit.  Unfortunately, irqentry_exit() is not the only exit
from the exception path.  The call to irqentry_exit_cond_resched() from
xen_pv_evtchn_do_upcall() bypasses irqentry_exit().

Make irqentry_exit_cond_resched() symmetrical with irqentry_enter() by
passing pt_regs to it.  This makes irqentry_exit_cond_resched() capable
of handling auxiliary pt_regs in future patches.
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Borislav Petkov
Signed-off-by: Ira Weiny
Acked-by: Rik van Riel
---
Forward ported from PKS series:
https://lore.kernel.org/lkml/20220419170649.1022246-19-ira.weiny@intel.com/
---
 arch/arm64/include/asm/preempt.h |  2 +-
 arch/arm64/kernel/entry-common.c |  4 ++--
 arch/x86/entry/common.c          |  2 +-
 include/linux/entry-common.h     | 17 ++++++++------
 kernel/entry/common.c            | 13 +++++++----
 kernel/sched/core.c              | 40 ++++++++++++++++----------------
 6 files changed, 43 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h
index 0159b625cc7f..bd185a214096 100644
--- a/arch/arm64/include/asm/preempt.h
+++ b/arch/arm64/include/asm/preempt.h
@@ -87,7 +87,7 @@ void preempt_schedule_notrace(void);

 #ifdef CONFIG_PREEMPT_DYNAMIC

-DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched_internal);
 void dynamic_preempt_schedule(void);
 #define __preempt_schedule()		dynamic_preempt_schedule()
 void dynamic_preempt_schedule_notrace(void);
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index c75ca36b4a49..a1cc8795b729 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -224,9 +224,9 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs)
 }

 #ifdef CONFIG_PREEMPT_DYNAMIC
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched_internal);
 #define need_irq_preemption() \
-	(static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
+	(static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched_internal))
 #else
 #define need_irq_preemption()	(IS_ENABLED(CONFIG_PREEMPTION))
 #endif
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 6c2826417b33..f1ba770d035d 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -309,7 +309,7 @@ __visible noinstr void xen_pv_evtchn_do_upcall(struct pt_regs *regs)

 	inhcall = get_and_clear_inhcall();
 	if (inhcall && !WARN_ON_ONCE(state.exit_rcu)) {
-		irqentry_exit_cond_resched();
+		irqentry_exit_cond_resched(regs);
 		instrumentation_end();
 		restore_inhcall(inhcall);
 	} else {
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 84a466b176cf..976cce7cf803 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -412,23 +412,26 @@ irqentry_state_t noinstr irqentry_enter(struct pt_regs *regs);

 /**
  * irqentry_exit_cond_resched - Conditionally reschedule on return from interrupt
+ * @regs:	Pointer to pt_regs of interrupted context
  *
  * Conditional reschedule with additional sanity checks.
  */
+void irqentry_exit_cond_resched(struct pt_regs *regs);
+
 void raw_irqentry_exit_cond_resched(void);
 #ifdef CONFIG_PREEMPT_DYNAMIC
 #if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-#define irqentry_exit_cond_resched_dynamic_enabled	raw_irqentry_exit_cond_resched
-#define irqentry_exit_cond_resched_dynamic_disabled	NULL
-DECLARE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched);
-#define irqentry_exit_cond_resched()	static_call(irqentry_exit_cond_resched)()
+#define irqentry_exit_cond_resched_internal_dynamic_enabled	raw_irqentry_exit_cond_resched
+#define irqentry_exit_cond_resched_internal_dynamic_disabled	NULL
+DECLARE_STATIC_CALL(irqentry_exit_cond_resched_internal, raw_irqentry_exit_cond_resched);
+#define irqentry_exit_cond_resched_internal() static_call(irqentry_exit_cond_resched_internal)()
 #elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched_internal);
 void dynamic_irqentry_exit_cond_resched(void);
-#define irqentry_exit_cond_resched()	dynamic_irqentry_exit_cond_resched()
+#define irqentry_exit_cond_resched_internal()	dynamic_irqentry_exit_cond_resched()
 #endif
 #else /* CONFIG_PREEMPT_DYNAMIC */
-#define irqentry_exit_cond_resched()	raw_irqentry_exit_cond_resched()
+#define irqentry_exit_cond_resched_internal()	raw_irqentry_exit_cond_resched()
 #endif /* CONFIG_PREEMPT_DYNAMIC */

 /**
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 063068a9ea9b..8c0f334c4b75 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -387,18 +387,23 @@ void raw_irqentry_exit_cond_resched(void)
 }
 #ifdef CONFIG_PREEMPT_DYNAMIC
 #if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-DEFINE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched);
+DEFINE_STATIC_CALL(irqentry_exit_cond_resched_internal, raw_irqentry_exit_cond_resched);
 #elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched_internal);
 void dynamic_irqentry_exit_cond_resched(void)
 {
-	if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
+	if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched_internal))
 		return;
 	raw_irqentry_exit_cond_resched();
 }
 #endif
 #endif

+void irqentry_exit_cond_resched(struct pt_regs *regs)
+{
+	irqentry_exit_cond_resched_internal();
+}
+
 noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
 {
 	lockdep_assert_irqs_disabled();
@@ -425,7 +430,7 @@ noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)

 	instrumentation_begin();
 	if (IS_ENABLED(CONFIG_PREEMPTION))
-		irqentry_exit_cond_resched();
+		irqentry_exit_cond_resched_internal();

 	/* Covers both tracing and lockdep */
 	trace_hardirqs_on();
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 189999007f32..38dd74ba1c86 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8419,29 +8419,29 @@ EXPORT_SYMBOL(__cond_resched_rwlock_write);
  *   SC:might_resched
  *   SC:preempt_schedule
  *   SC:preempt_schedule_notrace
- *   SC:irqentry_exit_cond_resched
+ *   SC:irqentry_exit_cond_resched_internal
  *
  *
  * NONE:
- *   cond_resched               <- __cond_resched
- *   might_resched              <- RET0
- *   preempt_schedule           <- NOP
- *   preempt_schedule_notrace   <- NOP
- *   irqentry_exit_cond_resched <- NOP
+ *   cond_resched                        <- __cond_resched
+ *   might_resched                       <- RET0
+ *   preempt_schedule                    <- NOP
+ *   preempt_schedule_notrace            <- NOP
+ *   irqentry_exit_cond_resched_internal <- NOP
  *
  * VOLUNTARY:
- *   cond_resched               <- __cond_resched
- *   might_resched              <- __cond_resched
- *   preempt_schedule           <- NOP
- *   preempt_schedule_notrace   <- NOP
- *   irqentry_exit_cond_resched <- NOP
+ *   cond_resched                        <- __cond_resched
+ *   might_resched                       <- __cond_resched
+ *   preempt_schedule                    <- NOP
+ *   preempt_schedule_notrace            <- NOP
+ *   irqentry_exit_cond_resched_internal <- NOP
  *
  * FULL:
- *   cond_resched               <- RET0
- *   might_resched              <- RET0
- *   preempt_schedule           <- preempt_schedule
- *   preempt_schedule_notrace   <- preempt_schedule_notrace
- *   irqentry_exit_cond_resched <- irqentry_exit_cond_resched
+ *   cond_resched                        <- RET0
+ *   might_resched                       <- RET0
+ *   preempt_schedule                    <- preempt_schedule
+ *   preempt_schedule_notrace            <- preempt_schedule_notrace
+ *   irqentry_exit_cond_resched_internal <- irqentry_exit_cond_resched_internal
  */

 enum {
@@ -8487,7 +8487,7 @@ void sched_dynamic_update(int mode)
 	preempt_dynamic_enable(might_resched);
 	preempt_dynamic_enable(preempt_schedule);
 	preempt_dynamic_enable(preempt_schedule_notrace);
-	preempt_dynamic_enable(irqentry_exit_cond_resched);
+	preempt_dynamic_enable(irqentry_exit_cond_resched_internal);

 	switch (mode) {
 	case preempt_dynamic_none:
@@ -8495,7 +8495,7 @@ void sched_dynamic_update(int mode)
 		preempt_dynamic_disable(might_resched);
 		preempt_dynamic_disable(preempt_schedule);
 		preempt_dynamic_disable(preempt_schedule_notrace);
-		preempt_dynamic_disable(irqentry_exit_cond_resched);
+		preempt_dynamic_disable(irqentry_exit_cond_resched_internal);
 		pr_info("Dynamic Preempt: none\n");
 		break;

@@ -8504,7 +8504,7 @@ void sched_dynamic_update(int mode)
 		preempt_dynamic_enable(might_resched);
 		preempt_dynamic_disable(preempt_schedule);
 		preempt_dynamic_disable(preempt_schedule_notrace);
-		preempt_dynamic_disable(irqentry_exit_cond_resched);
+		preempt_dynamic_disable(irqentry_exit_cond_resched_internal);
 		pr_info("Dynamic Preempt: voluntary\n");
 		break;

@@ -8513,7 +8513,7 @@ void sched_dynamic_update(int mode)
 		preempt_dynamic_disable(might_resched);
 		preempt_dynamic_enable(preempt_schedule);
 		preempt_dynamic_enable(preempt_schedule_notrace);
-		preempt_dynamic_enable(irqentry_exit_cond_resched);
+		preempt_dynamic_enable(irqentry_exit_cond_resched_internal);
 		pr_info("Dynamic Preempt: full\n");
 		break;
 	}
--
2.35.3

From nobody Sat Apr 20 03:37:57 2024
From: ira.weiny@intel.com
To: Rik van Riel, Borislav Petkov, Dave Hansen
Cc: Ira Weiny, Dave Hansen, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [RFC PATCH 2/5] entry: Add calls for save/restore auxiliary pt_regs
Date: Fri, 5 Aug 2022 10:30:06 -0700
Message-Id: <20220805173009.3128098-3-ira.weiny@intel.com>
In-Reply-To: <20220805173009.3128098-1-ira.weiny@intel.com>
References: <20220805173009.3128098-1-ira.weiny@intel.com>

From: Ira Weiny

Some architectures have auxiliary pt_regs space available to store
information on the stack during exceptions.  This information is easier
to obtain and store within C code rather than in arch-specific assembly.

Define empty calls to architecture-specific save and restore auxiliary
pt_regs functions.  Call these functions on generic entry/exit.

NOTE: Due to the split nature of the Xen exit code,
irqentry_exit_cond_resched() requires an unbalanced call to
arch_restore_aux_pt_regs().
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Borislav Petkov
Signed-off-by: Ira Weiny
Acked-by: Rik van Riel
---
Forward ported from PKS series:
https://lore.kernel.org/lkml/20220419170649.1022246-20-ira.weiny@intel.com/
---
 include/linux/entry-common.h |  8 ++++++++
 kernel/entry/common.c        | 16 ++++++++++++++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 976cce7cf803..1c09ba64ad28 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -79,6 +79,14 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs);
 static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) {}
 #endif

+#ifndef arch_save_aux_pt_regs
+static inline void arch_save_aux_pt_regs(struct pt_regs *regs) { }
+#endif
+
+#ifndef arch_restore_aux_pt_regs
+static inline void arch_restore_aux_pt_regs(struct pt_regs *regs) { }
+#endif
+
 /**
  * enter_from_user_mode - Establish state when coming from user mode
  *
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 8c0f334c4b75..a70a0f314aee 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -317,7 +317,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)

 	if (user_mode(regs)) {
 		irqentry_enter_from_user_mode(regs);
-		return ret;
+		goto aux_save;
 	}

 	/*
@@ -356,7 +356,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
 		instrumentation_end();

 		ret.exit_rcu = true;
-		return ret;
+		goto aux_save;
 	}

 	/*
@@ -371,6 +371,11 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
 	trace_hardirqs_off_finish();
 	instrumentation_end();

+aux_save:
+	instrumentation_begin();
+	arch_save_aux_pt_regs(regs);
+	instrumentation_end();
+
 	return ret;
 }

@@ -401,6 +406,7 @@ void dynamic_irqentry_exit_cond_resched(void)

 void irqentry_exit_cond_resched(struct pt_regs *regs)
 {
+	arch_restore_aux_pt_regs(regs);
 	irqentry_exit_cond_resched_internal();
 }

@@ -408,6 +414,10 @@ noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
 {
 	lockdep_assert_irqs_disabled();

+	instrumentation_begin();
+	arch_restore_aux_pt_regs(regs);
+	instrumentation_end();
+
 	/* Check whether this returns to user mode */
 	if (user_mode(regs)) {
 		irqentry_exit_to_user_mode(regs);
@@ -459,6 +469,7 @@ irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs)
 	instrumentation_begin();
 	trace_hardirqs_off_finish();
 	ftrace_nmi_enter();
+	arch_save_aux_pt_regs(regs);
 	instrumentation_end();

 	return irq_state;
@@ -467,6 +478,7 @@ irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs)
 void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state)
 {
 	instrumentation_begin();
+	arch_restore_aux_pt_regs(regs);
 	ftrace_nmi_exit();
 	if (irq_state.lockdep) {
 		trace_hardirqs_on_prepare();
--
2.35.3

From nobody Sat Apr 20 03:37:57 2024
From: ira.weiny@intel.com
To: Rik van Riel, Borislav Petkov, Dave Hansen
Cc: Ira Weiny, Dave Hansen, Dan Williams, Peter Zijlstra, Thomas Gleixner, Andy Lutomirski, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [RFC PATCH 3/5] x86/entry: Add auxiliary pt_regs space
Date: Fri, 5 Aug 2022 10:30:07 -0700
Message-Id: <20220805173009.3128098-4-ira.weiny@intel.com>
In-Reply-To: <20220805173009.3128098-1-ira.weiny@intel.com>
References: <20220805173009.3128098-1-ira.weiny@intel.com>

From: Ira Weiny

Rik van Riel reports that knowledge of where a fault hits is valuable in
detecting CPU failures in large data centers.[0]  Auxiliary pt_regs
space is a useful place to store the CPU number and avoids passing
additional data through the exception call stacks.
Two possible places for preserving this state were originally
considered: irqentry_state_t or pt_regs.[1]  pt_regs was much more
complicated and potentially fraught with unintended consequences.[2]
However, Andy Lutomirski came up with a way to hide additional values on
the stack which can be accessed as "extended_pt_regs".[3]  This method
allows any function with access to pt_regs to obtain the extra
information without expanding the use of irqentry_state_t, and leaves
pt_regs intact for compatibility with outside tools like BPF.

Prepare the assembly code to add a hidden auxiliary pt_regs space.  To
keep things simple, the assembly code only adds space on the stack as
defined by the C code which needs it.  The use of this space is left to
the C code, which is required to select ARCH_HAS_PTREGS_AUXILIARY to
enable this support.

Each nested exception gets another copy of this auxiliary space,
allowing for any number of levels of exception handling.

Initially the space is left empty and results in no code changes because
ARCH_HAS_PTREGS_AUXILIARY is not set.  Subsequent patches adding data to
pt_regs_auxiliary must select ARCH_HAS_PTREGS_AUXILIARY or a build
failure will occur.  The use of ARCH_HAS_PTREGS_AUXILIARY also avoids
the introduction of two instructions (addq/subq) on every entry call
when the extra space is not needed.

32-bit is specifically excluded.

Peter, Thomas, Andy, Dave, and Dan all suggested parts of the patch or
aided in its development.
[0] https://lore.kernel.org/all/20220805101644.2e674553@imladris.surriel.co= m/ [1] https://lore.kernel.org/lkml/CALCETrVe1i5JdyzD_BcctxQJn+ZE3T38EFPgjxN1F= 577M36g+w@mail.gmail.com/ [2] https://lore.kernel.org/lkml/874kpxx4jf.fsf@nanos.tec.linutronix.de/#t [3] https://lore.kernel.org/lkml/CALCETrUHwZPic89oExMMe-WyDY8-O3W68NcZvse3= =3DPGW+iW5=3Dw@mail.gmail.com/ Cc: Rik van Riel Cc: Dave Hansen Cc: Borislav Petkov Suggested-by: Dave Hansen Suggested-by: Dan Williams Suggested-by: Peter Zijlstra Suggested-by: Thomas Gleixner Suggested-by: Andy Lutomirski Signed-off-by: Ira Weiny Acked-by: Rik van Riel --- Forward port from PKS series https://lore.kernel.org/lkml/20220419170649.1022246-18-ira.weiny@intel.com/ --- arch/x86/Kconfig | 4 ++++ arch/x86/entry/calling.h | 19 +++++++++++++++++++ arch/x86/entry/entry_64.S | 22 ++++++++++++++++++++++ arch/x86/entry/entry_64_compat.S | 6 ++++++ arch/x86/include/asm/ptrace.h | 18 ++++++++++++++++++ arch/x86/kernel/asm-offsets_64.c | 15 +++++++++++++++ arch/x86/kernel/head_64.S | 6 ++++++ 7 files changed, 90 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index fb5900e2c29a..b35f6a472e09 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1874,6 +1874,10 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS =20 If unsure, say y. 
=20 +config ARCH_HAS_PTREGS_AUXILIARY + depends on X86_64 + bool + choice prompt "TSX enable mode" depends on CPU_SUP_INTEL diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h index f6907627172b..b7515f8b0092 100644 --- a/arch/x86/entry/calling.h +++ b/arch/x86/entry/calling.h @@ -65,6 +65,25 @@ For 32-bit we have the following conventions - kernel is= built with * for assembly code: */ =20 +#ifdef CONFIG_ARCH_HAS_PTREGS_AUXILIARY + +.macro PUSH_PTREGS_AUXILIARY + /* add space for pt_regs_auxiliary */ + subq $PTREGS_AUX_SIZE, %rsp +.endm + +.macro POP_PTREGS_AUXILIARY + /* remove space for pt_regs_auxiliary */ + addq $PTREGS_AUX_SIZE, %rsp +.endm + +#else + +#define PUSH_PTREGS_AUXILIARY +#define POP_PTREGS_AUXILIARY + +#endif + .macro PUSH_REGS rdx=3D%rdx rcx=3D%rcx rax=3D%rax save_ret=3D0 .if \save_ret pushq %rsi /* pt_regs->si */ diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 9953d966d124..4f9f7f5cb563 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -362,7 +362,9 @@ SYM_CODE_END(xen_error_entry) movq $-1, ORIG_RAX(%rsp) /* no syscall to restart */ .endif =20 + PUSH_PTREGS_AUXILIARY call \cfunc + POP_PTREGS_AUXILIARY =20 /* For some configurations \cfunc ends up being a noreturn. */ REACHABLE @@ -472,7 +474,9 @@ SYM_CODE_START(\asmsym) =20 movq %rsp, %rdi /* pt_regs pointer */ =20 + PUSH_PTREGS_AUXILIARY call \cfunc + POP_PTREGS_AUXILIARY =20 jmp paranoid_exit =20 @@ -535,7 +539,9 @@ SYM_CODE_START(\asmsym) * stack. */ movq %rsp, %rdi /* pt_regs pointer */ + PUSH_PTREGS_AUXILIARY call vc_switch_off_ist + POP_PTREGS_AUXILIARY movq %rax, %rsp /* Switch to new stack */ =20 ENCODE_FRAME_POINTER @@ -547,7 +553,9 @@ SYM_CODE_START(\asmsym) =20 movq %rsp, %rdi /* pt_regs pointer */ =20 + PUSH_PTREGS_AUXILIARY call kernel_\cfunc + POP_PTREGS_AUXILIARY =20 /* * No need to switch back to the IST stack. 
The current stack is either @@ -584,7 +592,9 @@ SYM_CODE_START(\asmsym) movq %rsp, %rdi /* pt_regs pointer into first argument */ movq ORIG_RAX(%rsp), %rsi /* get error code into 2nd argument*/ movq $-1, ORIG_RAX(%rsp) /* no syscall to restart */ + PUSH_PTREGS_AUXILIARY call \cfunc + POP_PTREGS_AUXILIARY =20 /* For some configurations \cfunc ends up being a noreturn. */ REACHABLE @@ -838,7 +848,9 @@ SYM_CODE_START_LOCAL(exc_xen_hypervisor_callback) movq %rdi, %rsp /* we don't return, adjust the stack frame */ UNWIND_HINT_REGS =20 + PUSH_PTREGS_AUXILIARY call xen_pv_evtchn_do_upcall + POP_PTREGS_AUXILIARY =20 jmp error_return SYM_CODE_END(exc_xen_hypervisor_callback) @@ -1062,7 +1074,9 @@ SYM_CODE_START_LOCAL(error_entry) .Lerror_entry_from_usermode_after_swapgs: =20 /* Put us onto the real thread stack. */ + PUSH_PTREGS_AUXILIARY call sync_regs + POP_PTREGS_AUXILIARY RET =20 /* @@ -1119,7 +1133,9 @@ SYM_CODE_START_LOCAL(error_entry) * as if we faulted immediately after IRET. */ leaq 8(%rsp), %rdi /* arg0 =3D pt_regs pointer */ + PUSH_PTREGS_AUXILIARY call fixup_bad_iret + POP_PTREGS_AUXILIARY mov %rax, %rdi jmp .Lerror_entry_from_usermode_after_swapgs SYM_CODE_END(error_entry) @@ -1229,7 +1245,9 @@ SYM_CODE_START(asm_exc_nmi) =20 movq %rsp, %rdi movq $-1, %rsi + PUSH_PTREGS_AUXILIARY call exc_nmi + POP_PTREGS_AUXILIARY =20 /* * Return back to user mode. We must *not* do the normal exit @@ -1265,6 +1283,8 @@ SYM_CODE_START(asm_exc_nmi) * +---------------------------------------------------------+ * | pt_regs | * +---------------------------------------------------------+ + * | (Optionally) pt_regs_extended | + * +---------------------------------------------------------+ * * The "original" frame is used by hardware. 
Before re-enabling * NMIs, we need to be done with it, and we need to leave enough @@ -1443,7 +1463,9 @@ end_repeat_nmi: =20 movq %rsp, %rdi movq $-1, %rsi + PUSH_PTREGS_AUXILIARY call exc_nmi + POP_PTREGS_AUXILIARY =20 /* Always restore stashed SPEC_CTRL value (see paranoid_entry) */ IBRS_EXIT save_reg=3D%r15 diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_com= pat.S index 682338e7e2a3..7f1e670f7b06 100644 --- a/arch/x86/entry/entry_64_compat.S +++ b/arch/x86/entry/entry_64_compat.S @@ -117,7 +117,9 @@ SYM_INNER_LABEL(entry_SYSENTER_compat_after_hwframe, SY= M_L_GLOBAL) .Lsysenter_flags_fixed: =20 movq %rsp, %rdi + PUSH_PTREGS_AUXILIARY call do_SYSENTER_32 + POP_PTREGS_AUXILIARY /* XEN PV guests always use IRET path */ ALTERNATIVE "testl %eax, %eax; jz swapgs_restore_regs_and_return_to_userm= ode", \ "jmp swapgs_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV @@ -212,7 +214,9 @@ SYM_INNER_LABEL(entry_SYSCALL_compat_after_hwframe, SYM= _L_GLOBAL) UNTRAIN_RET =20 movq %rsp, %rdi + PUSH_PTREGS_AUXILIARY call do_fast_syscall_32 + POP_PTREGS_AUXILIARY /* XEN PV guests always use IRET path */ ALTERNATIVE "testl %eax, %eax; jz swapgs_restore_regs_and_return_to_userm= ode", \ "jmp swapgs_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV @@ -351,6 +355,8 @@ SYM_CODE_START(entry_INT80_compat) UNTRAIN_RET =20 movq %rsp, %rdi + PUSH_PTREGS_AUXILIARY call do_int80_syscall_32 + POP_PTREGS_AUXILIARY jmp swapgs_restore_regs_and_return_to_usermode SYM_CODE_END(entry_INT80_compat) diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index f4db78b09c8f..5a9c85893459 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_PTRACE_H #define _ASM_X86_PTRACE_H =20 +#include #include #include #include @@ -91,6 +92,23 @@ struct pt_regs { /* top of stack page */ }; =20 +/* + * NOTE: Features which add data to pt_regs_auxiliary must select + * ARCH_HAS_PTREGS_AUXILIARY. 
Failure to do so will result in a build fai= lure. + */ +struct pt_regs_auxiliary { +}; + +struct pt_regs_extended { + struct pt_regs_auxiliary aux; + struct pt_regs pt_regs __aligned(8); +}; + +static inline struct pt_regs_extended *to_extended_pt_regs(struct pt_regs = *regs) +{ + return container_of(regs, struct pt_regs_extended, pt_regs); +} + #endif /* !__i386__ */ =20 #ifdef CONFIG_PARAVIRT diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets= _64.c index 9b698215d261..413fe632445b 100644 --- a/arch/x86/kernel/asm-offsets_64.c +++ b/arch/x86/kernel/asm-offsets_64.c @@ -4,6 +4,7 @@ #endif =20 #include +#include =20 #if defined(CONFIG_KVM_GUEST) #include @@ -60,5 +61,19 @@ int main(void) DEFINE(stack_canary_offset, offsetof(struct fixed_percpu_data, stack_cana= ry)); BLANK(); #endif + +#ifdef CONFIG_ARCH_HAS_PTREGS_AUXILIARY + /* Size of Auxiliary pt_regs data */ + DEFINE(PTREGS_AUX_SIZE, sizeof(struct pt_regs_extended) - + sizeof(struct pt_regs)); +#else + /* + * Adding data to struct pt_regs_auxiliary requires setting + * ARCH_HAS_PTREGS_AUXILIARY + */ + BUILD_BUG_ON((sizeof(struct pt_regs_extended) - + sizeof(struct pt_regs)) !=3D 0); +#endif + return 0; } diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index d860d437631b..3a41273acb1c 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -398,8 +398,10 @@ SYM_CODE_START_NOALIGN(vc_boot_ghcb) movq %rsp, %rdi movq ORIG_RAX(%rsp), %rsi movq initial_vc_handler(%rip), %rax + PUSH_PTREGS_AUXILIARY ANNOTATE_RETPOLINE_SAFE call *%rax + POP_PTREGS_AUXILIARY =20 /* Unwind pt_regs */ POP_REGS @@ -479,7 +481,9 @@ SYM_CODE_START_LOCAL(early_idt_handler_common) UNWIND_HINT_REGS =20 movq %rsp,%rdi /* RDI =3D pt_regs; RSI is already trapnr */ + PUSH_PTREGS_AUXILIARY call do_early_exception + POP_PTREGS_AUXILIARY =20 decl early_recursion_flag(%rip) jmp restore_regs_and_return_to_kernel @@ -508,7 +512,9 @@ SYM_CODE_START_NOALIGN(vc_no_ghcb) /* Call C handler */ 
movq %rsp, %rdi movq ORIG_RAX(%rsp), %rsi + PUSH_PTREGS_AUXILIARY call do_vc_no_ghcb + POP_PTREGS_AUXILIARY =20 /* Unwind pt_regs */ POP_REGS --=20 2.35.3 From nobody Sat Apr 20 03:37:57 2024 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6029C25B07 for ; Fri, 5 Aug 2022 17:31:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241407AbiHERa6 (ORCPT ); Fri, 5 Aug 2022 13:30:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45820 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241307AbiHERag (ORCPT ); Fri, 5 Aug 2022 13:30:36 -0400 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EB3471CFCC for ; Fri, 5 Aug 2022 10:30:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1659720627; x=1691256627; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2dM1WuVb6OYAaV0IBScAMcULpTf1F1saRBlYGODMl4Y=; b=O5NQFTmWrIh3RahCydOgJk4XM95UBTpl1FJzoA2lIfaeXtzKEqJZim0Y R99n+uoeLEnIBj9hLQCqjjBrCiyYPzwAoYiyvpHt5sN3PTPh/M1I7M5lJ dnH+am/nvT1wVZ7OecFSNyuyQBCDoSn8T0Mz97uM4l8Kk/vLVhQm8Hk0j ANFbj9PRCZnjGFKc62+ZZSZ25H09b9FceMULrN4w6YaMPozUw/He6zj/K KLThPCnZlVxievzZ8T/NTuCZHW492AWBHA6E9X+VtshR2+D+EFAjMdzNz D9lAc/3nAn8WQgYDzjSbl6MTHL8MX1DP9JI3Z7585aLoMgzVlpFkRUseX Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10430"; a="270634129" X-IronPort-AV: E=Sophos;i="5.93,216,1654585200"; d="scan'208";a="270634129" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2022 10:30:27 -0700 X-IronPort-AV: E=Sophos;i="5.93,216,1654585200"; d="scan'208";a="706678695" Received: 
From: ira.weiny@intel.com
To: Rik van Riel, Borislav Petkov, Dave Hansen
Cc: Dave Jones, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [RFC PATCH 4/5] x86,mm: print likely CPU at segfault time
Date: Fri, 5 Aug 2022 10:30:08 -0700
Message-Id: <20220805173009.3128098-5-ira.weiny@intel.com>
In-Reply-To: <20220805173009.3128098-1-ira.weiny@intel.com>
References: <20220805173009.3128098-1-ira.weiny@intel.com>

From: Rik van Riel

In a large enough fleet of computers, it is common to have a few bad
CPUs. Those can often be identified by seeing that some commonly run
kernel code, which runs fine everywhere else, keeps crashing on the
same CPU core on one particular bad system.

However, the failure modes in CPUs that have gone bad over the years
are often oddly specific, and the only bad behavior seen might be
segfaults in programs like bash, python, or various system daemons
that run fine everywhere else.

Add a printk() to show_signal_msg() to print the CPU, core, and socket
at segfault time. This is not perfect, since the task might get
rescheduled on another CPU between when the fault hit and when the
message is printed, but in practice this has been good enough to help
us identify several bad CPU cores.
  segfault[1349]: segfault at 0 ip 000000000040113a sp 00007ffc6d32e360 error 4 in segfault[401000+1000] on CPU 0 (core 0, socket 0)

This printk can be controlled through /proc/sys/debug/exception-trace.

Signed-off-by: Rik van Riel
CC: Dave Jones
---
 arch/x86/mm/fault.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 971977c438fc..82cf23975aa1 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -769,6 +769,8 @@ show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 		unsigned long address, struct task_struct *tsk)
 {
 	const char *loglvl = task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG;
+	/* This is a racy snapshot, but it's better than nothing. */
+	int cpu = raw_smp_processor_id();
 
 	if (!unhandled_signal(tsk, SIGSEGV))
 		return;
@@ -782,6 +784,14 @@ show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 
 	print_vma_addr(KERN_CONT " in ", regs->ip);
 
+	/*
+	 * Dump the likely CPU where the fatal segfault happened.
+	 * This can help identify faulty hardware.
+	 */
+	printk(KERN_CONT " on CPU %d (core %d, socket %d)", cpu,
+	       topology_core_id(cpu), topology_physical_package_id(cpu));
+
 	printk(KERN_CONT "\n");
 
 	show_opcodes(regs, loglvl);
-- 
2.35.3

From nobody Sat Apr 20 03:37:57 2024
From: ira.weiny@intel.com
To: Rik van Riel, Borislav Petkov, Dave Hansen
Cc: Ira Weiny, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [RFC PATCH 5/5] x86/entry: Store CPU info on exception entry
Date: Fri, 5 Aug 2022 10:30:09 -0700
Message-Id: <20220805173009.3128098-6-ira.weiny@intel.com>
In-Reply-To: <20220805173009.3128098-1-ira.weiny@intel.com>
References: <20220805173009.3128098-1-ira.weiny@intel.com>

From: Ira Weiny

The CPU information of an exception is useful in determining where bad
CPUs are in a large data center.

Define arch_{save|restore}_auxiliary_pt_regs() and set
ARCH_HAS_PTREGS_AUXILIARY default to yes.

Store the CPU on exception entry and use it later.
Cc: Rik van Riel
Suggested-by: Borislav Petkov
Suggested-by: Dave Hansen
Signed-off-by: Ira Weiny
Acked-by: Rik van Riel
---
 arch/x86/Kconfig                    |  2 +-
 arch/x86/include/asm/entry-common.h | 12 ++++++++++++
 arch/x86/include/asm/ptrace.h       |  1 +
 arch/x86/mm/fault.c                 |  4 ++--
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b35f6a472e09..707650a6ecb2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1876,7 +1876,7 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS
 
 config ARCH_HAS_PTREGS_AUXILIARY
 	depends on X86_64
-	bool
+	def_bool y
 
 choice
 	prompt "TSX enable mode"
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index 674ed46d3ced..eb145106929a 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -95,4 +95,16 @@ static __always_inline void arch_exit_to_user_mode(void)
 }
 #define arch_exit_to_user_mode arch_exit_to_user_mode
 
+#ifdef CONFIG_ARCH_HAS_PTREGS_AUXILIARY
+
+static inline void arch_save_aux_pt_regs(struct pt_regs *regs)
+{
+	struct pt_regs_auxiliary *aux_pt_regs = &to_extended_pt_regs(regs)->aux;
+
+	aux_pt_regs->cpu = raw_smp_processor_id();
+}
+#define arch_save_aux_pt_regs arch_save_aux_pt_regs
+
+#endif
+
 #endif
diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 5a9c85893459..b403b469996f 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -97,6 +97,7 @@ struct pt_regs {
  * ARCH_HAS_PTREGS_AUXILIARY. Failure to do so will result in a build failure.
  */
 struct pt_regs_auxiliary {
+	int cpu;
 };
 
 struct pt_regs_extended {
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 82cf23975aa1..5df99fe49494 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -768,9 +768,9 @@ static inline void
 show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 		unsigned long address, struct task_struct *tsk)
 {
+	struct pt_regs_auxiliary *aux_pt_regs = &to_extended_pt_regs(regs)->aux;
 	const char *loglvl = task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG;
-	/* This is a racy snapshot, but it's better than nothing. */
-	int cpu = raw_smp_processor_id();
+	int cpu = aux_pt_regs->cpu;
 
 	if (!unhandled_signal(tsk, SIGSEGV))
 		return;
-- 
2.35.3