[PATCH v5] x86/split_lock: fix delayed detection enabling

Posted by Maksim Davydov 9 months ago
In warn mode with mitigation disabled, each CPU that hits a split lock
disables detection in order to make progress and schedules delayed work
that re-enables detection later. However, all CPUs currently share one
global delayed work structure. As a result, if a split lock occurs on
several CPUs at the same time (within 2 jiffies), only one CPU schedules
the delayed work and the rest do not, so nothing re-enables detection on
those CPUs. The return value of schedule_delayed_work_on() would have
revealed this, but the code does not check it.

A diagram that can help to understand the bug reproduction:
https://lore.kernel.org/all/2cd54041-253b-4e78-b8ea-dbe9b884ff9b@yandex-team.ru/

In order to fix the warn mode with disabled mitigation mode, the delayed
work has to be per-CPU.
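
For illustration only, here is a minimal userspace sketch of the failure
mode described above. It is not kernel code: struct sim_delayed_work,
sim_schedule() and NR_SIM_CPUS are hypothetical names, and the only
behaviour modelled is that queueing an already-pending work item does
nothing and returns false, as schedule_delayed_work_on() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SIM_CPUS 4	/* hypothetical CPU count for the simulation */

/* Hypothetical stand-in for struct delayed_work: only "pending" is modelled. */
struct sim_delayed_work {
	bool pending;
};

/*
 * Models one aspect of schedule_delayed_work_on(): if the work item is
 * already pending, nothing is queued and false is returned.
 */
static bool sim_schedule(struct sim_delayed_work *work)
{
	assert(work != NULL);

	if (work->pending)
		return false;

	work->pending = true;
	return true;
}

/* All CPUs trap within the same window and share one work item. */
static unsigned int sim_shared_work(void)
{
	struct sim_delayed_work shared = { .pending = false };
	unsigned int cpu, queued = 0;

	for (cpu = 0; cpu < NR_SIM_CPUS; cpu++)
		queued += sim_schedule(&shared);

	return queued;	/* number of CPUs that got a re-enable queued */
}

/* Same scenario, but every CPU owns its own work item. */
static unsigned int sim_per_cpu_work(void)
{
	struct sim_delayed_work per_cpu[NR_SIM_CPUS] = { { .pending = false } };
	unsigned int cpu, queued = 0;

	for (cpu = 0; cpu < NR_SIM_CPUS; cpu++)
		queued += sim_schedule(&per_cpu[cpu]);

	return queued;
}
```

With the shared item only the first caller manages to queue work, while
with one item per CPU every caller does. That is the property the
per-CPU conversion relies on.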

v4 -> v5:
* use pure_initcall for the per-CPU structure initialization instead of
  deferred initialization

v3 -> v4:
* rebased the patch onto the latest master

v2 -> v3:
* changed the place and time of the per-CPU structure initialization.
  An initcall didn't seem to be a good place for it, so deferred
  initialization was used.

Fixes: 727209376f49 ("x86/split_lock: Add sysctl to control the misery mode")
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
---
 arch/x86/kernel/cpu/bus_lock.c | 35 +++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/bus_lock.c b/arch/x86/kernel/cpu/bus_lock.c
index 6cba85c79d42..1a8112dba37a 100644
--- a/arch/x86/kernel/cpu/bus_lock.c
+++ b/arch/x86/kernel/cpu/bus_lock.c
@@ -192,7 +192,32 @@ static void __split_lock_reenable(struct work_struct *work)
 {
 	sld_update_msr(true);
 }
-static DECLARE_DELAYED_WORK(sl_reenable, __split_lock_reenable);
+/*
+ * In order for each CPU to schedule its own delayed work independently of the
+ * others, the delayed work struct must be per-CPU. This is not required when
+ * sysctl_sld_mitigate is enabled because the semaphore limits the number of
+ * simultaneously scheduled delayed works to 1.
+ */
+static DEFINE_PER_CPU(struct delayed_work, sl_reenable);
+
+/*
+ * Per-CPU delayed_work can't be statically initialized properly because
+ * the struct address is unknown. Thus per-CPU delayed_work structures
+ * have to be initialized during kernel initialization.
+ */
+static int __init setup_split_lock_delayed_work(void)
+{
+	unsigned int cpu;
+
+	for_each_possible_cpu(cpu) {
+		struct delayed_work *work = per_cpu_ptr(&sl_reenable, cpu);
+
+		INIT_DELAYED_WORK(work, __split_lock_reenable);
+	}
+
+	return 0;
+}
+pure_initcall(setup_split_lock_delayed_work);
 
 /*
  * If a CPU goes offline with pending delayed work to re-enable split lock
@@ -215,13 +240,14 @@ static void split_lock_warn(unsigned long ip)
 {
 	struct delayed_work *work;
 	int cpu;
+	unsigned int saved_sld_mitigate = READ_ONCE(sysctl_sld_mitigate);
 
 	if (!current->reported_split_lock)
 		pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n",
 				    current->comm, current->pid, ip);
 	current->reported_split_lock = 1;
 
-	if (sysctl_sld_mitigate) {
+	if (saved_sld_mitigate) {
 		/*
 		 * misery factor #1:
 		 * sleep 10ms before trying to execute split lock.
@@ -234,12 +260,11 @@ static void split_lock_warn(unsigned long ip)
 		 */
 		if (down_interruptible(&buslock_sem) == -EINTR)
 			return;
-		work = &sl_reenable_unlock;
-	} else {
-		work = &sl_reenable;
 	}
 
 	cpu = get_cpu();
+	work = (saved_sld_mitigate ?
+		&sl_reenable_unlock : per_cpu_ptr(&sl_reenable, cpu));
 	schedule_delayed_work_on(cpu, work, 2);
 
 	/* Disable split lock detection on this CPU to make progress */
-- 
2.34.1
Re: [PATCH v5] x86/split_lock: fix delayed detection enabling
Posted by Ingo Molnar 9 months ago
* Maksim Davydov <davydov-max@yandex-team.ru> wrote:

> If the warn mode with disabled mitigation mode is used, then on each
> CPU where the split lock occurred detection will be disabled in order to
> make progress and delayed work will be scheduled, which then will enable
> detection back. Now it turns out that all CPUs use one global delayed
> work structure. This leads to the fact that if a split lock occurs on
> several CPUs at the same time (within 2 jiffies), only one CPU will
> schedule delayed work, but the rest will not. The return value of
> schedule_delayed_work_on() would have shown this, but it is not checked
> in the code.

So we already merged the previous version into the locking tree ~10 
days ago and it's all in -next already:

  c929d08df8be ("x86/split_lock: Fix the delayed detection logic")

  https://web.git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=c929d08df8bee855528b9d15b853c892c54e1eee

Is there anything new in your -v5 patch, other than undoing all the 
changelog cleanups I did for the previous version? ;-)

Thanks,

	Ingo
Re: [PATCH v5] x86/split_lock: fix delayed detection enabling
Posted by Maksim Davydov 9 months ago

On 3/18/25 23:24, Ingo Molnar wrote:
> 
> * Maksim Davydov <davydov-max@yandex-team.ru> wrote:
> 
>> If the warn mode with disabled mitigation mode is used, then on each
>> CPU where the split lock occurred detection will be disabled in order to
>> make progress and delayed work will be scheduled, which then will enable
>> detection back. Now it turns out that all CPUs use one global delayed
>> work structure. This leads to the fact that if a split lock occurs on
>> several CPUs at the same time (within 2 jiffies), only one CPU will
>> schedule delayed work, but the rest will not. The return value of
>> schedule_delayed_work_on() would have shown this, but it is not checked
>> in the code.
> 
> So we already merged the previous version into the locking tree ~10
> days ago and it's all in -next already:
> 
>    c929d08df8be ("x86/split_lock: Fix the delayed detection logic")
> 
>    https://web.git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=c929d08df8bee855528b9d15b853c892c54e1eee
> 
> Is there anything new in your -v5 patch, other than undoing all the
> changelog cleanups I did for the previous version? ;-)
> 

Oh, sorry, I missed it.
Yes, v5 uses an initcall instead of deferred initialization.
Either v4 or v5 is fine with me. Please feel free to choose whichever
variant is more convenient for you. :-)

> Thanks,
> 
> 	Ingo

-- 
Best regards,
Maksim Davydov
Re: [PATCH v5] x86/split_lock: fix delayed detection enabling
Posted by Ingo Molnar 8 months, 4 weeks ago
* Maksim Davydov <davydov-max@yandex-team.ru> wrote:

> 
> 
> On 3/18/25 23:24, Ingo Molnar wrote:
> > 
> > * Maksim Davydov <davydov-max@yandex-team.ru> wrote:
> > 
> > > If the warn mode with disabled mitigation mode is used, then on each
> > > CPU where the split lock occurred detection will be disabled in order to
> > > make progress and delayed work will be scheduled, which then will enable
> > > detection back. Now it turns out that all CPUs use one global delayed
> > > work structure. This leads to the fact that if a split lock occurs on
> > > several CPUs at the same time (within 2 jiffies), only one CPU will
> > > schedule delayed work, but the rest will not. The return value of
> > > schedule_delayed_work_on() would have shown this, but it is not checked
> > > in the code.
> > 
> > So we already merged the previous version into the locking tree ~10
> > days ago and it's all in -next already:
> > 
> >    c929d08df8be ("x86/split_lock: Fix the delayed detection logic")
> > 
> >    https://web.git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=c929d08df8bee855528b9d15b853c892c54e1eee
> > 
> > Is there anything new in your -v5 patch, other than undoing all the
> > changelog cleanups I did for the previous version? ;-)
> > 
> 
> Oh, sorry, I missed it.
> Yes, in v5 initcall is used instead of deferred initialization.
> Either v4 or v5 are good for me. Please be free to choose the more
> convenient variant for you. :-)

Could you please send a delta patch on top of tip:master (or -next) 
that implements the initcall approach? Basically -v5, but on top of 
-v4.

I merged -v4 because I thought the fix was delayed enough already and 
-v4 was functionally fine too, but I won't say no to even better code! :-)

Thanks,

	Ingo