I must have missed something, but I can't understand this logic; it
seems unnecessarily complicated today.

1. Now that switch_fpu_finish() doesn't load the FPU state, I think it
   can be folded into switch_fpu_prepare().

2. But the main question is why __switch_to() -> switch_fpu_finish()
   sets TIF_NEED_FPU_LOAD on the "next" task.

   I think that set_tsk_thread_flag(prev_p, TIF_NEED_FPU_LOAD) makes
   more sense.

   Just in case, note that fpu_clone() sets TIF_NEED_FPU_LOAD, so
   we should not worry about the 1st __switch_to(next_p).

IOW, can you explain why the (untested) patch below could be wrong?

We can even remove the PF_KTHREAD check in switch_fpu_prepare(), kthreads
should never clear TIF_NEED_FPU_LOAD...
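
To spell out the reasoning, here is a toy userspace model. It is not
kernel code: struct toy_task and the toy_*() helpers are made up for
illustration. It assumes the flag is cleared only for the current task
(when it returns to user mode), which is exactly the assumption I would
like confirmed. Under that assumption every task that is not current
already has the flag set, so marking "prev" at switch-out looks
sufficient:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task; the bool models TIF_NEED_FPU_LOAD. */
struct toy_task {
	bool need_fpu_load;
};

/* Models fpu_clone(): a new task starts with the flag set. */
static void toy_clone(struct toy_task *t)
{
	t->need_fpu_load = true;
}

/* Models return to user mode: registers get loaded, flag is cleared. */
static void toy_return_to_user(struct toy_task *cur)
{
	cur->need_fpu_load = false;
}

/*
 * Models the proposed switch: mark the outgoing task only.
 * Returns whether the incoming task already had the flag set.
 */
static bool toy_switch(struct toy_task *prev, struct toy_task *next)
{
	prev->need_fpu_load = true;
	return next->need_fpu_load;
}

/* Runs a fixed schedule and counts switches to a task without the flag. */
int toy_run(void)
{
	struct toy_task t[3];
	int cur = 0, violations = 0;

	for (int i = 0; i < 3; i++)
		toy_clone(&t[i]);

	for (int step = 0; step < 12; step++) {
		/* Only the current task ever clears its own flag. */
		if (step % 2 == 0)
			toy_return_to_user(&t[cur]);

		/* Pick one of the other two tasks, never the current one. */
		int next = (cur + 1 + step % 2) % 3;
		assert(next != cur);

		if (!toy_switch(&t[cur], &t[next]))
			violations++;
		cur = next;
	}
	return violations;
}
```

In this model toy_run() never sees an incoming task with the flag
clear. If some path can clear the flag for a task that is not current,
the model (and probably the patch) is wrong.
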
Oleg.
---
diff --git a/arch/x86/include/asm/fpu/sched.h b/arch/x86/include/asm/fpu/sched.h
index 5fd12634bcc4..cdd60f434289 100644
--- a/arch/x86/include/asm/fpu/sched.h
+++ b/arch/x86/include/asm/fpu/sched.h
@@ -54,18 +54,10 @@ static inline void switch_fpu_prepare(struct task_struct *old, int cpu)
*/
old_fpu->last_cpu = cpu;
+ set_tsk_thread_flag(old, TIF_NEED_FPU_LOAD);
+
trace_x86_fpu_regs_deactivated(old_fpu);
}
}
-/*
- * Delay loading of the complete FPU state until the return to userland.
- * PKRU is handled separately.
- */
-static inline void switch_fpu_finish(struct task_struct *new)
-{
- if (cpu_feature_enabled(X86_FEATURE_FPU))
- set_tsk_thread_flag(new, TIF_NEED_FPU_LOAD);
-}
-
#endif /* _ASM_X86_FPU_SCHED_H */
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 4636ef359973..b398a6ef2923 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -208,8 +208,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
raw_cpu_write(current_task, next_p);
- switch_fpu_finish(next_p);
-
/* Load the Intel cache allocation PQR MSR. */
resctrl_sched_in(next_p);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 7196ca7048be..e8262e637ea4 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -671,8 +671,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
raw_cpu_write(current_task, next_p);
raw_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
- switch_fpu_finish(next_p);
-
/* Reload sp0. */
update_task_stack(next_p);