From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Filipe Manana, stable@vger.kernel.org
Subject: [patch 1/3] x86/fpu: Prevent FPU state corruption
Date: Sun, 1 May 2022 21:31:43 +0200 (CEST)
Message-ID: <20220501193102.588689270@linutronix.de>
References: <20220501192740.203963477@linutronix.de>

The FPU usage related to task FPU management is either protected by
disabling interrupts (switch_to, return to user) or via fpregs_lock()
which is a wrapper around local_bh_disable(). When kernel code wants to
use the FPU then it has to check whether it is possible by calling
irq_fpu_usable().

But the condition in irq_fpu_usable() is wrong. It allows FPU to be used
when:

  !in_interrupt() || interrupted_user_mode() || interrupted_kernel_fpu_idle()

The latter checks whether some other context already uses the FPU in the
kernel, but if that's not the case then it allows the FPU to be used
unconditionally even if the calling context interrupted a fpregs_lock()
critical region. If that happens then the FPU state of the interrupted
context becomes corrupted.

Allow in-kernel FPU usage only when no other context has in-kernel FPU
usage and either the calling context is not hard interrupt context or the
hard interrupt did not interrupt a local bottom-half disabled region.

It's hard to find a proper Fixes tag as the condition was broken in one
way or the other for a very long time and the eager/lazy FPU changes
caused a lot of churn. Picked something remotely connected from the
history.

This survived undetected for quite some time as FPU usage in interrupt
context is rare, but the recent changes to the random code unearthed it at
least on a kernel which had FPU debugging enabled. There is probably a
higher rate of silent corruption as not all issues can be detected by the
FPU debugging code. This will be addressed in a subsequent change.
Fixes: 5d2bd7009f30 ("x86, fpu: decouple non-lazy/eager fpu restore from xsave")
Reported-by: Filipe Manana
Signed-off-by: Thomas Gleixner
Cc: stable@vger.kernel.org
Reviewed-by: Borislav Petkov
Tested-by: Filipe Manana
---
 arch/x86/kernel/fpu/core.c | 67 +++++++++++++++++---------------------------
 1 file changed, 26 insertions(+), 41 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -41,17 +41,7 @@ struct fpu_state_config fpu_user_cfg __r
  */
 struct fpstate init_fpstate __ro_after_init;
 
-/*
- * Track whether the kernel is using the FPU state
- * currently.
- *
- * This flag is used:
- *
- * - by IRQ context code to potentially use the FPU
- *   if it's unused.
- *
- * - to debug kernel_fpu_begin()/end() correctness
- */
+/* Track in-kernel FPU usage */
 static DEFINE_PER_CPU(bool, in_kernel_fpu);
 
 /*
@@ -59,42 +49,37 @@ static DEFINE_PER_CPU(bool, in_kernel_fp
  */
 DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
 
-static bool kernel_fpu_disabled(void)
-{
-	return this_cpu_read(in_kernel_fpu);
-}
-
-static bool interrupted_kernel_fpu_idle(void)
-{
-	return !kernel_fpu_disabled();
-}
-
-/*
- * Were we in user mode (or vm86 mode) when we were
- * interrupted?
- *
- * Doing kernel_fpu_begin/end() is ok if we are running
- * in an interrupt context from user mode - we'll just
- * save the FPU state as required.
- */
-static bool interrupted_user_mode(void)
-{
-	struct pt_regs *regs = get_irq_regs();
-	return regs && user_mode(regs);
-}
-
 /*
  * Can we use the FPU in kernel mode with the
  * whole "kernel_fpu_begin/end()" sequence?
- *
- * It's always ok in process context (ie "not interrupt")
- * but it is sometimes ok even from an irq.
  */
 bool irq_fpu_usable(void)
 {
-	return !in_interrupt() ||
-		interrupted_user_mode() ||
-		interrupted_kernel_fpu_idle();
+	if (WARN_ON_ONCE(in_nmi()))
+		return false;
+
+	/* In kernel FPU usage already active? */
+	if (this_cpu_read(in_kernel_fpu))
+		return false;
+
+	/*
+	 * When not in NMI or hard interrupt context, FPU can be used:
+	 *
+	 * - Task context is safe except from within fpregs_lock()'ed
+	 *   critical regions.
+	 *
+	 * - Soft interrupt processing context which cannot happen
+	 *   while in a fpregs_lock()'ed critical region.
+	 */
+	if (!in_hardirq())
+		return true;
+
+	/*
+	 * In hard interrupt context it's safe when soft interrupts
+	 * are enabled, which means the interrupt did not hit in
+	 * a fpregs_lock()'ed critical region.
+	 */
+	return !softirq_count();
 }
 EXPORT_SYMBOL(irq_fpu_usable);

From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Filipe Manana
Subject: [patch 2/3] x86/fpu: Rename irq_fpu_usable()
Date: Sun, 1 May 2022 21:31:45 +0200 (CEST)
Message-ID: <20220501193102.647328777@linutronix.de>
References: <20220501192740.203963477@linutronix.de>

This can be invoked from almost any context, except NMI. The only
requirement is that the context is not migratable. Rename it to
kernel_fpu_usable().

Signed-off-by: Thomas Gleixner
Reviewed-by: Borislav Petkov
Tested-by: Filipe Manana
---
 arch/um/include/asm/fpu/api.h       |  2 +-
 arch/x86/include/asm/fpu/api.h      |  4 ++--
 arch/x86/include/asm/simd.h         |  2 +-
 arch/x86/kernel/fpu/core.c          | 13 +++++++------
 net/netfilter/nft_set_pipapo_avx2.c |  2 +-
 5 files changed, 12 insertions(+), 11 deletions(-)

--- a/arch/um/include/asm/fpu/api.h
+++ b/arch/um/include/asm/fpu/api.h
@@ -11,7 +11,7 @@
 #define kernel_fpu_begin() (void)0
 #define kernel_fpu_end() (void)0
 
-static inline bool irq_fpu_usable(void)
+static inline bool kernel_fpu_usable(void)
 {
 	return true;
 }

--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -19,7 +19,7 @@
  * disables preemption so be careful if you intend to use it for long periods
  * of time.
  * If you intend to use the FPU in irq/softirq you need to check first with
- * irq_fpu_usable() if it is possible.
+ * kernel_fpu_usable() if it is possible.
  */
 
 /* Kernel FPU states to initialize in kernel_fpu_begin_mask() */
@@ -28,7 +28,7 @@
 
 extern void kernel_fpu_begin_mask(unsigned int kfpu_mask);
 extern void kernel_fpu_end(void);
-extern bool irq_fpu_usable(void);
+extern bool kernel_fpu_usable(void);
 extern void fpregs_mark_activate(void);
 
 /* Code that is unaware of kernel_fpu_begin_mask() can use this */

--- a/arch/x86/include/asm/simd.h
+++ b/arch/x86/include/asm/simd.h
@@ -8,5 +8,5 @@
  */
 static __must_check inline bool may_use_simd(void)
 {
-	return irq_fpu_usable();
+	return kernel_fpu_usable();
 }

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -49,11 +49,12 @@ static DEFINE_PER_CPU(bool, in_kernel_fp
  */
 DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
 
-/*
- * Can we use the FPU in kernel mode with the
- * whole "kernel_fpu_begin/end()" sequence?
+/**
+ * kernel_fpu_usable - Check whether kernel FPU usage is possible
+ *
+ * Has to be invoked before calling kernel_fpu_begin().
  */
-bool irq_fpu_usable(void)
+bool kernel_fpu_usable(void)
 {
 	if (WARN_ON_ONCE(in_nmi()))
 		return false;
@@ -81,7 +82,7 @@ bool irq_fpu_usable(void)
 	 */
 	return !softirq_count();
 }
-EXPORT_SYMBOL(irq_fpu_usable);
+EXPORT_SYMBOL(kernel_fpu_usable);
 
 /*
  * Track AVX512 state use because it is known to slow the max clock
@@ -409,7 +410,7 @@ void kernel_fpu_begin_mask(unsigned int
 {
 	preempt_disable();
 
-	WARN_ON_FPU(!irq_fpu_usable());
+	WARN_ON_FPU(!kernel_fpu_usable());
 	WARN_ON_FPU(this_cpu_read(in_kernel_fpu));
 
 	this_cpu_write(in_kernel_fpu, true);

--- a/net/netfilter/nft_set_pipapo_avx2.c
+++ b/net/netfilter/nft_set_pipapo_avx2.c
@@ -1128,7 +1128,7 @@ bool nft_pipapo_avx2_lookup(const struct
 	bool map_index;
 	int i, ret = 0;
 
-	if (unlikely(!irq_fpu_usable()))
+	if (unlikely(!kernel_fpu_usable()))
 		return nft_pipapo_lookup(net, set, key, ext);
 
 	m = rcu_dereference(priv->match);

From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Filipe Manana
Subject: [patch 3/3] x86/fpu: Make FPU protection more robust
Date: Sun, 1 May 2022 21:31:47 +0200 (CEST)
Message-ID: <20220501193102.704267030@linutronix.de>
References: <20220501192740.203963477@linutronix.de>

FPU state maintenance is protected by fpregs_lock(), which is a wrapper
around local_bh_disable() on non-RT kernels and preempt_disable() on RT
kernels.

In-kernel FPU usage has its own protection via a per-CPU variable. This
separation is pointless and error-prone, as a recently discovered wrong
condition for granting in-kernel FPU usage has shown.

Make the whole FPU state protection simpler and more robust by using the
per-CPU usage variable for all FPU operations so state is tracked
consistently.

Change related WARN_ON_FPU() instances to WARN_ON_ONCE() as the usage of
CONFIG_X86_DEBUG_FPU is optional and hides inconsistencies for a
potentially long time.
Signed-off-by: Thomas Gleixner
Tested-by: Filipe Manana
---
 arch/x86/include/asm/fpu/api.h | 17 ++-------
 arch/x86/kernel/fpu/core.c     | 78 +++++++++++++++++++++++++--------------
 2 files changed, 52 insertions(+), 43 deletions(-)

--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -66,21 +66,8 @@ static inline void kernel_fpu_begin(void
  *
  * Disabling preemption also serializes against kernel_fpu_begin().
  */
-static inline void fpregs_lock(void)
-{
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
-		local_bh_disable();
-	else
-		preempt_disable();
-}
-
-static inline void fpregs_unlock(void)
-{
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
-		local_bh_enable();
-	else
-		preempt_enable();
-}
+extern void fpregs_lock(void);
+extern void fpregs_unlock(void);
 
 #ifdef CONFIG_X86_DEBUG_FPU
 extern void fpregs_assert_state_consistent(void);

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -42,7 +42,7 @@ struct fpu_state_config fpu_user_cfg __r
 struct fpstate init_fpstate __ro_after_init;
 
 /* Track in-kernel FPU usage */
-static DEFINE_PER_CPU(bool, in_kernel_fpu);
+static DEFINE_PER_CPU(bool, fpu_in_use);
 
 /*
  * Track which context is using the FPU on the CPU:
@@ -50,6 +50,50 @@ static DEFINE_PER_CPU(bool, in_kernel_fp
 DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
 
 /**
+ * fpregs_lock - Lock FPU state for maintenance operations
+ *
+ * This protects against preemption, soft interrupts and in-kernel FPU
+ * usage on both !RT and RT enabled kernels.
+ *
+ * !RT kernels use local_bh_disable() to prevent soft interrupt processing
+ * and preemption.
+ *
+ * On RT kernels local_bh_disable() is not sufficient because it only
+ * serializes soft interrupt related sections via a local lock, but stays
+ * preemptible. Disabling preemption is the right choice here as bottom
+ * half processing is always in thread context on RT kernels so it
+ * implicitly prevents bottom half processing as well.
+ */
+void fpregs_lock(void)
+{
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_disable();
+	else
+		preempt_disable();
+
+	WARN_ON_ONCE(this_cpu_read(fpu_in_use));
+	this_cpu_write(fpu_in_use, true);
+}
+EXPORT_SYMBOL_GPL(fpregs_lock);
+
+/**
+ * fpregs_unlock - Unlock FPU state after maintenance operations
+ *
+ * Counterpart to fpregs_lock().
+ */
+void fpregs_unlock(void)
+{
+	WARN_ON_ONCE(!this_cpu_read(fpu_in_use));
+	this_cpu_write(fpu_in_use, false);
+
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_enable();
+	else
+		preempt_enable();
+}
+EXPORT_SYMBOL_GPL(fpregs_unlock);
+
+/**
  * kernel_fpu_usable - Check whether kernel FPU usage is possible
  *
  * Has to be invoked before calling kernel_fpu_begin().
@@ -59,28 +103,7 @@ bool kernel_fpu_usable(void)
 	if (WARN_ON_ONCE(in_nmi()))
 		return false;
 
-	/* In kernel FPU usage already active? */
-	if (this_cpu_read(in_kernel_fpu))
-		return false;
-
-	/*
-	 * When not in NMI or hard interrupt context, FPU can be used:
-	 *
-	 * - Task context is safe except from within fpregs_lock()'ed
-	 *   critical regions.
-	 *
-	 * - Soft interrupt processing context which cannot happen
-	 *   while in a fpregs_lock()'ed critical region.
-	 */
-	if (!in_hardirq())
-		return true;
-
-	/*
-	 * In hard interrupt context it's safe when soft interrupts
-	 * are enabled, which means the interrupt did not hit in
-	 * a fpregs_lock()'ed critical region.
-	 */
-	return !softirq_count();
+	return !this_cpu_read(fpu_in_use);
 }
 EXPORT_SYMBOL(kernel_fpu_usable);
 
@@ -410,10 +433,9 @@ void kernel_fpu_begin_mask(unsigned int
 {
 	preempt_disable();
 
-	WARN_ON_FPU(!kernel_fpu_usable());
-	WARN_ON_FPU(this_cpu_read(in_kernel_fpu));
+	WARN_ON_ONCE(!kernel_fpu_usable());
 
-	this_cpu_write(in_kernel_fpu, true);
+	this_cpu_write(fpu_in_use, true);
 
 	if (!(current->flags & PF_KTHREAD) &&
 	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
@@ -433,9 +455,9 @@ EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask)
 
 void kernel_fpu_end(void)
 {
-	WARN_ON_FPU(!this_cpu_read(in_kernel_fpu));
+	WARN_ON_ONCE(!this_cpu_read(fpu_in_use));
 
-	this_cpu_write(in_kernel_fpu, false);
+	this_cpu_write(fpu_in_use, false);
 	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_end);