[PATCH] x86/fpu: Clarify FPU context cacheline alignment

Ingo Molnar posted 1 patch 10 months ago
There is a newer version of this series
Posted by Ingo Molnar 10 months ago

* Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Apr 10, 2025 at 12:10:56PM +0200, Ingo Molnar wrote:
> > 
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > > On Wed, Apr 09, 2025 at 11:11:23PM +0200, Ingo Molnar wrote:
> > > 
> > > > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> > > > index 5ea7e5d2c4de..b7f7c9c83409 100644
> > > > --- a/arch/x86/include/asm/processor.h
> > > > +++ b/arch/x86/include/asm/processor.h
> > > > @@ -514,12 +514,9 @@ struct thread_struct {
> > > >  
> > > >  	struct thread_shstk	shstk;
> > > >  #endif
> > > > -
> > > > -	/* Floating point and extended processor state */
> > > > -	struct fpu		*fpu;
> > > >  };
> > > >  
> > > > -#define x86_task_fpu(task) ((task)->thread.fpu)
> > > > +#define x86_task_fpu(task)	((struct fpu *)((void *)(task) + sizeof(*(task))))
> > > 
> > > Doesn't our FPU state need to be cacheline aligned?
> > 
> > Yeah, and we do have a check for that:
> > 
> > +       BUILD_BUG_ON(sizeof(*dst) % SMP_CACHE_BYTES != 0);
> 
> Ah, missed that. Clearly I need to improve my reading skillz :-)

Admittedly it's written a bit obtusely - how about the patch below?

Thanks,

	Ingo

============================>
From: Ingo Molnar <mingo@kernel.org>
Date: Thu, 10 Apr 2025 12:52:16 +0200
Subject: [PATCH] x86/fpu: Clarify FPU context cacheline alignment

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uros Bizjak <ubizjak@gmail.com>
---
 arch/x86/kernel/fpu/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d0a45f6492cb..3a19877a314e 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -607,7 +607,8 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal,
 	 * We allocate the new FPU structure right after the end of the task struct.
 	 * task allocation size already took this into account.
 	 *
-	 * This is safe because task_struct size is a multiple of cacheline size.
+	 * This is safe because task_struct size is a multiple of cacheline size,
+	 * thus x86_task_fpu() will always be cacheline aligned as well.
 	 */
 	struct fpu *dst_fpu = (void *)dst + sizeof(*dst);
[tip: x86/merge] x86/fpu: Clarify FPU context cacheline alignment
Posted by tip-bot2 for Ingo Molnar 9 months, 4 weeks ago
The following commit has been merged into the x86/merge branch of tip:

Commit-ID:     e3a52b67f54aa36ab21265eeea016460b5fe1c46
Gitweb:        https://git.kernel.org/tip/e3a52b67f54aa36ab21265eeea016460b5fe1c46
Author:        Ingo Molnar <mingo@kernel.org>
AuthorDate:    Thu, 10 Apr 2025 12:52:16 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 14 Apr 2025 08:18:29 +02:00

x86/fpu: Clarify FPU context cacheline alignment

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/Z_ejggklB5-IWB5W@gmail.com
---
 arch/x86/kernel/fpu/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d0a45f6..3a19877 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -607,7 +607,8 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal,
 	 * We allocate the new FPU structure right after the end of the task struct.
 	 * task allocation size already took this into account.
 	 *
-	 * This is safe because task_struct size is a multiple of cacheline size.
+	 * This is safe because task_struct size is a multiple of cacheline size,
+	 * thus x86_task_fpu() will always be cacheline aligned as well.
 	 */
 	struct fpu *dst_fpu = (void *)dst + sizeof(*dst);
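
The alignment argument in the comment above can be checked in a small userspace sketch. Everything in it is a stand-in and not kernel code: CACHELINE is assumed to be 64 in place of SMP_CACHE_BYTES, struct demo_task and struct demo_fpu are hypothetical replacements for task_struct and struct fpu, and _Static_assert plays the role of BUILD_BUG_ON(). The sketch also assumes that the task allocation itself starts on a cacheline boundary, which it arranges with aligned_alloc().

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHELINE 64	/* assumption: stands in for SMP_CACHE_BYTES */

/* Hypothetical stand-in for task_struct: 100 payload bytes, padded to 128. */
struct demo_task {
	alignas(CACHELINE) char payload[100];
};

/* Hypothetical stand-in for struct fpu, which wants cacheline alignment. */
struct demo_fpu {
	alignas(CACHELINE) char regs[64];
};

/* Userspace analogue of the BUILD_BUG_ON() quoted in the thread. */
_Static_assert(sizeof(struct demo_task) % CACHELINE == 0,
	       "demo_task size must be a multiple of the cacheline size");

/* Same pointer arithmetic as the x86_task_fpu() macro in the quoted diff. */
static struct demo_fpu *demo_task_fpu(struct demo_task *task)
{
	return (struct demo_fpu *)((char *)task + sizeof(*task));
}

/*
 * Allocate a task plus trailing FPU area in one cacheline-aligned block.
 * Returns 1 if the FPU pointer is cacheline aligned, 0 if it is not,
 * and -1 if the allocation failed.
 */
static int demo_fpu_is_aligned(void)
{
	size_t total = sizeof(struct demo_task) + sizeof(struct demo_fpu);
	struct demo_task *task = aligned_alloc(CACHELINE, total);
	int aligned;

	if (!task)
		return -1;

	aligned = ((uintptr_t)demo_task_fpu(task) % CACHELINE) == 0;
	free(task);
	return aligned;
}
```

If the payload were changed so that sizeof(struct demo_task) stopped being a multiple of CACHELINE, the build would fail at the _Static_assert instead of producing a misaligned FPU pointer at run time.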