[PATCH v4 0/5] Xen INTX/GSI event channel delivery fixes

David Woodhouse posted 5 patches 3 years, 3 months ago
Failed in applying to current master (apply log)
Documentation/admin-guide/kernel-parameters.txt |  4 ++
arch/arm/xen/enlighten.c                        |  2 +-
arch/x86/xen/enlighten_hvm.c                    | 15 ++++-
arch/x86/xen/smp_hvm.c                          | 27 ++++++---
drivers/xen/events/events_base.c                | 10 ---
drivers/xen/platform-pci.c                      |  8 ++-
drivers/xen/xenbus/xenbus.h                     |  1 +
drivers/xen/xenbus/xenbus_comms.c               |  8 ---
drivers/xen/xenbus/xenbus_probe.c               | 81 ++++++++++++++++++++-----
include/xen/xenbus.h                            |  2 +-
10 files changed, 110 insertions(+), 48 deletions(-)
[PATCH v4 0/5] Xen INTX/GSI event channel delivery fixes
Posted by David Woodhouse 3 years, 3 months ago
Fix various breakages with INTX/GSI event channel interrupt delivery,
and add a command line option to test it so that it hopefully doesn't
get so broken again.

Karim attempted to rip this out in 2017 but put it back because it's still
necessary with older versions of Xen. This fixes it properly, and makes it
easier to test. cf. https://lkml.org/lkml/2017/4/10/266

v2:
 • Add no_vector_callback to Documentation/admin-guide/kernel-parameters.txt
 • Further fixes to conditional smp_ops initialisation in xen_hvm_smp_init()

v3:
 • Rename test option to xen_no_vector_callback

v4:
 • Fix unconditional references to xen_have_vector_callback in Arm build

David Woodhouse (5):
      xen: Fix event channel callback via INTX/GSI
      xen: Set platform PCI device INTX affinity to CPU0
      x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery
      x86/xen: Don't register Xen IPIs when they aren't going to be used
      x86/xen: Fix xen_hvm_smp_init() when vector callback not available

 Documentation/admin-guide/kernel-parameters.txt |  4 ++
 arch/arm/xen/enlighten.c                        |  2 +-
 arch/x86/xen/enlighten_hvm.c                    | 15 ++++-
 arch/x86/xen/smp_hvm.c                          | 27 ++++++---
 drivers/xen/events/events_base.c                | 10 ---
 drivers/xen/platform-pci.c                      |  8 ++-
 drivers/xen/xenbus/xenbus.h                     |  1 +
 drivers/xen/xenbus/xenbus_comms.c               |  8 ---
 drivers/xen/xenbus/xenbus_probe.c               | 81 ++++++++++++++++++++-----
 include/xen/xenbus.h                            |  2 +-
 10 files changed, 110 insertions(+), 48 deletions(-)




Re: [PATCH v4 0/5] Xen INTX/GSI event channel delivery fixes
Posted by Jürgen Groß 3 years, 3 months ago
On 13.01.21 14:26, David Woodhouse wrote:
> Fix various breakages with INTX/GSI event channel interrupt delivery,
> and add a command line option to test it so that it hopefully doesn't
> get so broken again.
> 
> Karim attempted to rip this out in 2017 but put it back because it's still
> necessary with older versions of Xen. This fixes it properly, and makes it
> easier to test. cf. https://lkml.org/lkml/2017/4/10/266
> 
> v2:
>   • Add no_vector_callback to Documentation/admin-guide/kernel-parameters.txt
>   • Further fixes to conditional smp_ops initialisation in xen_hvm_smp_init()
> 
> v3:
>   • Rename test option to xen_no_vector_callback
> 
> v4:
>   • Fix unconditional references to xen_have_vector_callback in Arm build
> 
> David Woodhouse (5):
>        xen: Fix event channel callback via INTX/GSI
>        xen: Set platform PCI device INTX affinity to CPU0
>        x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery
>        x86/xen: Don't register Xen IPIs when they aren't going to be used
>        x86/xen: Fix xen_hvm_smp_init() when vector callback not available
> 
>   Documentation/admin-guide/kernel-parameters.txt |  4 ++
>   arch/arm/xen/enlighten.c                        |  2 +-
>   arch/x86/xen/enlighten_hvm.c                    | 15 ++++-
>   arch/x86/xen/smp_hvm.c                          | 27 ++++++---
>   drivers/xen/events/events_base.c                | 10 ---
>   drivers/xen/platform-pci.c                      |  8 ++-
>   drivers/xen/xenbus/xenbus.h                     |  1 +
>   drivers/xen/xenbus/xenbus_comms.c               |  8 ---
>   drivers/xen/xenbus/xenbus_probe.c               | 81 ++++++++++++++++++++-----
>   include/xen/xenbus.h                            |  2 +-
>   10 files changed, 110 insertions(+), 48 deletions(-)

Series pushed to xen/tip.git for-linus-5.11


Juergen
[PATCH] xen: Fix XenStore initialisation for XS_LOCAL
Posted by David Woodhouse 3 years, 3 months ago
From: David Woodhouse <dwmw@amazon.co.uk>

In commit 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
I reworked the triggering of xenbus_probe().

I tried to simplify things by taking out the workqueue based startup
triggered from wake_waiting(); the somewhat poorly named xenbus IRQ
handler.

I missed the fact that in the XS_LOCAL case (Dom0 starting its own
xenstored or xenstore-stubdom, which happens after the kernel is booted
completely), that IRQ-based trigger is still actually needed.

So... put it back, except more cleanly. By just spawning a xenbus_probe
thread which waits on xb_waitq and runs the probe the first time it
gets woken, just as the workqueue-based hack did.

This is actually a nicer approach for *all* the back ends with different
interrupt methods, and we can switch them all over to that without the
complex conditions for when to trigger it. But not in -rc6. This is
the minimal fix for the regression, although it's a step in the right
direction instead of doing a partial revert and actually putting the
workqueue back. It's also simpler than the workqueue.

Fixes: 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 drivers/xen/xenbus/xenbus_probe.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index c8f0282bb649..18ffd0551b54 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -714,6 +714,23 @@ static bool xs_hvm_defer_init_for_callback(void)
 #endif
 }
 
+static int xenbus_probe_thread(void *unused)
+{
+	DEFINE_WAIT(w);
+
+	/*
+	 * We actually just want to wait for *any* trigger of xb_waitq,
+	 * and run xenbus_probe() the moment it occurs.
+	 */
+	prepare_to_wait(&xb_waitq, &w, TASK_INTERRUPTIBLE);
+	schedule();
+	finish_wait(&xb_waitq, &w);
+
+	DPRINTK("probing");
+	xenbus_probe();
+	return 0;
+}
+
 static int __init xenbus_probe_initcall(void)
 {
 	/*
@@ -725,6 +742,20 @@ static int __init xenbus_probe_initcall(void)
 	     !xs_hvm_defer_init_for_callback()))
 		xenbus_probe();
 
+	/*
+	 * For XS_LOCAL, spawn a thread which will wait for xenstored
+	 * or a xenstore-stubdom to be started, then probe. It will be
+	 * triggered when communication starts happening, by waiting
+	 * on xb_waitq.
+	 */
+	if (xen_store_domain_type == XS_LOCAL) {
+		struct task_struct *probe_task;
+
+		probe_task = kthread_run(xenbus_probe_thread, NULL,
+					 "xenbus_probe");
+		if (IS_ERR(probe_task))
+			return PTR_ERR(probe_task);
+	}
 	return 0;
 }
 device_initcall(xenbus_probe_initcall);
-- 
2.29.2

Re: [PATCH] xen: Fix XenStore initialisation for XS_LOCAL
Posted by Boris Ostrovsky 3 years, 3 months ago

On 1/26/21 12:01 PM, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> In commit 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
> I reworked the triggering of xenbus_probe().
> 
> I tried to simplify things by taking out the workqueue based startup
> triggered from wake_waiting(); the somewhat poorly named xenbus IRQ
> handler.
> 
> I missed the fact that in the XS_LOCAL case (Dom0 starting its own
> xenstored or xenstore-stubdom, which happens after the kernel is booted
> completely), that IRQ-based trigger is still actually needed.
> 
> So... put it back, except more cleanly. By just spawning a xenbus_probe
> thread which waits on xb_waitq and runs the probe the first time it
> gets woken, just as the workqueue-based hack did.
> 
> This is actually a nicer approach for *all* the back ends with different
> interrupt methods, and we can switch them all over to that without the
> complex conditions for when to trigger it. But not in -rc6. This is
> the minimal fix for the regression, although it's a step in the right
> direction instead of doing a partial revert and actually putting the
> workqueue back. It's also simpler than the workqueue.


Wouldn't the minimal fix be to restore wake_waiting() to its previous 

        if (unlikely(xenstored_ready == 0)) {
                xenstored_ready = 1;
                schedule_work(&probe_work);
        }

(And to avoid changing xenbus_probe()'s signature just create a wrapper)

-boris


Re: [PATCH] xen: Fix XenStore initialisation for XS_LOCAL
Posted by Jürgen Groß 3 years, 3 months ago
On 26.01.21 22:36, Boris Ostrovsky wrote:
> 
> 
> On 1/26/21 12:01 PM, David Woodhouse wrote:
>> From: David Woodhouse <dwmw@amazon.co.uk>
>>
>> In commit 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
>> I reworked the triggering of xenbus_probe().
>>
>> I tried to simplify things by taking out the workqueue based startup
>> triggered from wake_waiting(); the somewhat poorly named xenbus IRQ
>> handler.
>>
>> I missed the fact that in the XS_LOCAL case (Dom0 starting its own
>> xenstored or xenstore-stubdom, which happens after the kernel is booted
>> completely), that IRQ-based trigger is still actually needed.
>>
>> So... put it back, except more cleanly. By just spawning a xenbus_probe
>> thread which waits on xb_waitq and runs the probe the first time it
>> gets woken, just as the workqueue-based hack did.
>>
>> This is actually a nicer approach for *all* the back ends with different
>> interrupt methods, and we can switch them all over to that without the
>> complex conditions for when to trigger it. But not in -rc6. This is
>> the minimal fix for the regression, although it's a step in the right
>> direction instead of doing a partial revert and actually putting the
>> workqueue back. It's also simpler than the workqueue.
> 
> 
> Wouldn't the minimal fix be to restore wake_waiting() to its previous
> 
>          if (unlikely(xenstored_ready == 0)) {
>                  xenstored_ready = 1;
>                  schedule_work(&probe_work);
>          }
> 
> (And to avoid changing xenbus_probe()'s signature just create a wrapper)

David and I had a longer chat on IRC regarding this fix.

The long term idea is to have just his current thread based variant
for all cases calling xenbus_probe() (so no call of xenbus_probe() at
any other place).

We agreed that this approach would be too risky now, but we wanted to
go in the right direction with the current fix. This is the outcome.


Juergen
Re: [PATCH] xen: Fix XenStore initialisation for XS_LOCAL
Posted by Jürgen Groß 3 years, 3 months ago
On 26.01.21 18:01, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> In commit 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
> I reworked the triggering of xenbus_probe().
> 
> I tried to simplify things by taking out the workqueue based startup
> triggered from wake_waiting(); the somewhat poorly named xenbus IRQ
> handler.
> 
> I missed the fact that in the XS_LOCAL case (Dom0 starting its own
> xenstored or xenstore-stubdom, which happens after the kernel is booted
> completely), that IRQ-based trigger is still actually needed.
> 
> So... put it back, except more cleanly. By just spawning a xenbus_probe
> thread which waits on xb_waitq and runs the probe the first time it
> gets woken, just as the workqueue-based hack did.
> 
> This is actually a nicer approach for *all* the back ends with different
> interrupt methods, and we can switch them all over to that without the
> complex conditions for when to trigger it. But not in -rc6. This is
> the minimal fix for the regression, although it's a step in the right
> direction instead of doing a partial revert and actually putting the
> workqueue back. It's also simpler than the workqueue.
> 
> Fixes: 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>

Committed to: xen/tip.git for-linus-5.11


Juergen