[PATCH] xen: Fix event channel callback via INTX/GSI

David Woodhouse posted 1 patch 3 years, 3 months ago
Failed in applying to current master (apply log)
arch/arm/xen/enlighten.c          |  2 +-
arch/x86/xen/enlighten_hvm.c      | 11 ++++-
drivers/xen/events/events_base.c  | 10 -----
drivers/xen/platform-pci.c        |  7 ++++
drivers/xen/xenbus/xenbus.h       |  1 +
drivers/xen/xenbus/xenbus_comms.c |  8 ----
drivers/xen/xenbus/xenbus_probe.c | 68 ++++++++++++++++++++++++-------
include/xen/xenbus.h              |  2 +-
8 files changed, 74 insertions(+), 35 deletions(-)
[PATCH] xen: Fix event channel callback via INTX/GSI
Posted by David Woodhouse 3 years, 3 months ago
From: David Woodhouse <dwmw@amazon.co.uk>

For a while, event channel notification via the PCI platform device
has been broken, because we attempt to communicate with xenstore before
we even have notifications working, in the xs_reset_watches() call
called from xs_init().

This mostly doesn't matter much in practice because for Xen versions
below 4.0 we avoid xs_reset_watches() anyway, because xenstore might
not cope with reading a non-existent key. And newer Xen *does* have
the vector callback support, so we rarely fall back to INTX/GSI
delivery.

But those working on Xen and Xen-compatible hypervisor implementations
do want to test that INTX/GSI delivery works correctly, as there are
still guest kernels in active use which don't use vector delivery yet.
So it's useful to have it actually working.

To that end, clean up a bit of the mess of xs_init() and xenbus_probe()
startup. Call xs_init() directly from xenbus_init() only in the !XS_HVM
case, deferring it to be called from xenbus_probe() in the XS_HVM case
instead.

Then fix up the invocation of xenbus_probe() to happen either from its
device_initcall if the callback is available early enough, or when the
callback is finally set up. This means that the hack of calling
xenbus_probe() from a workqueue after the first interrupt is no longer
needed.

Add a 'no_vector_callback' command line argument to test it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/arm/xen/enlighten.c          |  2 +-
 arch/x86/xen/enlighten_hvm.c      | 11 ++++-
 drivers/xen/events/events_base.c  | 10 -----
 drivers/xen/platform-pci.c        |  7 ++++
 drivers/xen/xenbus/xenbus.h       |  1 +
 drivers/xen/xenbus/xenbus_comms.c |  8 ----
 drivers/xen/xenbus/xenbus_probe.c | 68 ++++++++++++++++++++++++-------
 include/xen/xenbus.h              |  2 +-
 8 files changed, 74 insertions(+), 35 deletions(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 60e901cd0de6..5a957a9a0984 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -371,7 +371,7 @@ static int __init xen_guest_init(void)
 	}
 	gnttab_init();
 	if (!xen_initial_domain())
-		xenbus_probe(NULL);
+		xenbus_probe();
 
 	/*
 	 * Making sure board specific code will not set up ops for
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 9e87ab010c82..a1c07e0c888e 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -188,6 +188,8 @@ static int xen_cpu_dead_hvm(unsigned int cpu)
        return 0;
 }
 
+static bool no_vector_callback __initdata;
+
 static void __init xen_hvm_guest_init(void)
 {
 	if (xen_pv_domain())
@@ -207,7 +209,7 @@ static void __init xen_hvm_guest_init(void)
 
 	xen_panic_handler_init();
 
-	if (xen_feature(XENFEAT_hvm_callback_vector))
+	if (!no_vector_callback && xen_feature(XENFEAT_hvm_callback_vector))
 		xen_have_vector_callback = 1;
 
 	xen_hvm_smp_init();
@@ -233,6 +235,13 @@ static __init int xen_parse_nopv(char *arg)
 }
 early_param("xen_nopv", xen_parse_nopv);
 
+static __init int xen_parse_no_vector_callback(char *arg)
+{
+	no_vector_callback = true;
+	return 0;
+}
+early_param("no_vector_callback", xen_parse_no_vector_callback);
+
 bool __init xen_hvm_need_lapic(void)
 {
 	if (xen_pv_domain())
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 6038c4c35db5..bbebe248b726 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -2010,16 +2010,6 @@ static struct irq_chip xen_percpu_chip __read_mostly = {
 	.irq_ack		= ack_dynirq,
 };
 
-int xen_set_callback_via(uint64_t via)
-{
-	struct xen_hvm_param a;
-	a.domid = DOMID_SELF;
-	a.index = HVM_PARAM_CALLBACK_IRQ;
-	a.value = via;
-	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
-}
-EXPORT_SYMBOL_GPL(xen_set_callback_via);
-
 #ifdef CONFIG_XEN_PVHVM
 /* Vector callbacks are better than PCI interrupts to receive event
  * channel notifications because we can receive vector callbacks on any
diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
index dd911e1ff782..5c3015a90a73 100644
--- a/drivers/xen/platform-pci.c
+++ b/drivers/xen/platform-pci.c
@@ -132,6 +132,13 @@ static int platform_pci_probe(struct pci_dev *pdev,
 			dev_warn(&pdev->dev, "request_irq failed err=%d\n", ret);
 			goto out;
 		}
+		/*
+		 * It doesn't strictly *have* to run on CPU0 but it sure
+		 * as hell better process the event channel ports delivered
+		 * to CPU0.
+		 */
+		irq_set_affinity(pdev->irq, cpumask_of(0));
+
 		callback_via = get_callback_via(pdev);
 		ret = xen_set_callback_via(callback_via);
 		if (ret) {
diff --git a/drivers/xen/xenbus/xenbus.h b/drivers/xen/xenbus/xenbus.h
index 5f5b8a7d5b80..05bbda51103f 100644
--- a/drivers/xen/xenbus/xenbus.h
+++ b/drivers/xen/xenbus/xenbus.h
@@ -113,6 +113,7 @@ int xenbus_probe_node(struct xen_bus_type *bus,
 		      const char *type,
 		      const char *nodename);
 int xenbus_probe_devices(struct xen_bus_type *bus);
+void xenbus_probe(void);
 
 void xenbus_dev_changed(const char *node, struct xen_bus_type *bus);
 
diff --git a/drivers/xen/xenbus/xenbus_comms.c b/drivers/xen/xenbus/xenbus_comms.c
index eb5151fc8efa..e5fda0256feb 100644
--- a/drivers/xen/xenbus/xenbus_comms.c
+++ b/drivers/xen/xenbus/xenbus_comms.c
@@ -57,16 +57,8 @@ DEFINE_MUTEX(xs_response_mutex);
 static int xenbus_irq;
 static struct task_struct *xenbus_task;
 
-static DECLARE_WORK(probe_work, xenbus_probe);
-
-
 static irqreturn_t wake_waiting(int irq, void *unused)
 {
-	if (unlikely(xenstored_ready == 0)) {
-		xenstored_ready = 1;
-		schedule_work(&probe_work);
-	}
-
 	wake_up(&xb_waitq);
 	return IRQ_HANDLED;
 }
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 38725d97d909..876f381b100a 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -682,29 +682,63 @@ void unregister_xenstore_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL_GPL(unregister_xenstore_notifier);
 
-void xenbus_probe(struct work_struct *unused)
+void xenbus_probe(void)
 {
 	xenstored_ready = 1;
 
+	/*
+	 * In the HVM case, xenbus_init() deferred its call to
+	 * xs_init() in case callbacks were not operational yet.
+	 * So do it now.
+	 */
+	if (xen_store_domain_type == XS_HVM)
+		xs_init();
+
 	/* Notify others that xenstore is up */
 	blocking_notifier_call_chain(&xenstore_chain, 0, NULL);
 }
-EXPORT_SYMBOL_GPL(xenbus_probe);
 
 static int __init xenbus_probe_initcall(void)
 {
-	if (!xen_domain())
-		return -ENODEV;
-
-	if (xen_initial_domain() || xen_hvm_domain())
-		return 0;
+	/*
+	 * Probe XenBus here in the XS_PV case, and also XS_HVM unless we
+	 * need to wait for the platform PCI device to come up, which is
+	 * the (XEN_PVPVM && !xen_have_vector_callback) case.
+	 */
+	if (xen_store_domain_type == XS_PV ||
+	    (xen_store_domain_type == XS_HVM &&
+	     (!IS_ENABLED(CONFIG_XEN_PVHVM) || xen_have_vector_callback)))
+		xenbus_probe();
 
-	xenbus_probe(NULL);
 	return 0;
 }
-
 device_initcall(xenbus_probe_initcall);
 
+int xen_set_callback_via(uint64_t via)
+{
+	struct xen_hvm_param a;
+	int ret;
+
+	a.domid = DOMID_SELF;
+	a.index = HVM_PARAM_CALLBACK_IRQ;
+	a.value = via;
+
+	ret = HYPERVISOR_hvm_op(HVMOP_set_param, &a);
+	if (ret)
+		return ret;
+
+	/*
+	 * If xenbus_probe_initcall() deferred the xenbus_probe()
+	 * due to the callback not functioning yet, we can do it now.
+	 */
+	if (!xenstored_ready && xen_store_domain_type == XS_HVM &&
+	    IS_ENABLED(CONFIG_XEN_PVHVM) && !xen_have_vector_callback)
+		xenbus_probe();
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(xen_set_callback_via);
+
 /* Set up event channel for xenstored which is run as a local process
  * (this is normally used only in dom0)
  */
@@ -817,11 +851,17 @@ static int __init xenbus_init(void)
 		break;
 	}
 
-	/* Initialize the interface to xenstore. */
-	err = xs_init();
-	if (err) {
-		pr_warn("Error initializing xenstore comms: %i\n", err);
-		goto out_error;
+	/*
+	 * HVM domains may not have a functional callback yet. In that
+	 * case let xs_init() be called from xenbus_probe(), which will
+	 * get invoked at an appropriate time.
+	 */
+	if (xen_store_domain_type != XS_HVM) {
+		err = xs_init();
+		if (err) {
+			pr_warn("Error initializing xenstore comms: %i\n", err);
+			goto out_error;
+		}
 	}
 
 	if ((xen_store_domain_type != XS_LOCAL) &&
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 5a8315e6d8a6..61202c83d560 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -183,7 +183,7 @@ void xs_suspend_cancel(void);
 
 struct work_struct;
 
-void xenbus_probe(struct work_struct *);
+void xenbus_probe(void);
 
 #define XENBUS_IS_ERR_READ(str) ({			\
 	if (!IS_ERR(str) && strlen(str) == 0) {		\
-- 
2.26.2

Re: [PATCH] xen: Fix event channel callback via INTX/GSI
Posted by boris.ostrovsky@oracle.com 3 years, 3 months ago
On 12/18/20 11:25 AM, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> For a while, event channel notification via the PCI platform device
> has been broken, because we attempt to communicate with xenstore before
> we even have notifications working, in the xs_reset_watches() call
> called from xs_init().
>
> This mostly doesn't matter much in practice because for Xen versions
> below 4.0 we avoid xs_reset_watches() anyway, because xenstore might
> not cope with reading a non-existent key. And newer Xen *does* have
> the vector callback support, so we rarely fall back to INTX/GSI
> delivery.
>
> But those working on Xen and Xen-compatible hypervisor implementations
> do want to test that INTX/GSI delivery works correctly, as there are
> still guest kernels in active use which don't use vector delivery yet.
> So it's useful to have it actually working.


Are there other cases where this would be useful? If it's just to test a hypervisor under development I would think that people who are doing that kind of work are capable of building their own kernel. My concern is mostly about having yet another boot option that is of interest to very few people who can easily work around not having it.


>
> To that end, clean up a bit of the mess of xs_init() and xenbus_probe()
> startup. Call xs_init() directly from xenbus_init() only in the !XS_HVM
> case, deferring it to be called from xenbus_probe() in the XS_HVM case
> instead.
>
> Then fix up the invocation of xenbus_probe() to happen either from its
> device_initcall if the callback is available early enough, or when the
> callback is finally set up. This means that the hack of calling
> xenbus_probe() from a workqueue after the first interrupt is no longer
> needed.
>
> Add a 'no_vector_callback' command line argument to test it.


This last one should be a separate patch I think.


>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  arch/arm/xen/enlighten.c          |  2 +-
>  arch/x86/xen/enlighten_hvm.c      | 11 ++++-
>  drivers/xen/events/events_base.c  | 10 -----
>  drivers/xen/platform-pci.c        |  7 ++++
>  drivers/xen/xenbus/xenbus.h       |  1 +
>  drivers/xen/xenbus/xenbus_comms.c |  8 ----
>  drivers/xen/xenbus/xenbus_probe.c | 68 ++++++++++++++++++++++++-------
>  include/xen/xenbus.h              |  2 +-
>  8 files changed, 74 insertions(+), 35 deletions(-)
>
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 60e901cd0de6..5a957a9a0984 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -371,7 +371,7 @@ static int __init xen_guest_init(void)
>  	}
>  	gnttab_init();
>  	if (!xen_initial_domain())
> -		xenbus_probe(NULL);
> +		xenbus_probe();
>  
>  	/*
>  	 * Making sure board specific code will not set up ops for
> diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
> index 9e87ab010c82..a1c07e0c888e 100644
> --- a/arch/x86/xen/enlighten_hvm.c
> +++ b/arch/x86/xen/enlighten_hvm.c
> @@ -188,6 +188,8 @@ static int xen_cpu_dead_hvm(unsigned int cpu)
>         return 0;
>  }
>  
> +static bool no_vector_callback __initdata;
> +
>  static void __init xen_hvm_guest_init(void)
>  {
>  	if (xen_pv_domain())
> @@ -207,7 +209,7 @@ static void __init xen_hvm_guest_init(void)
>  
>  	xen_panic_handler_init();
>  
> -	if (xen_feature(XENFEAT_hvm_callback_vector))
> +	if (!no_vector_callback && xen_feature(XENFEAT_hvm_callback_vector))
>  		xen_have_vector_callback = 1;
>  
>  	xen_hvm_smp_init();
> @@ -233,6 +235,13 @@ static __init int xen_parse_nopv(char *arg)
>  }
>  early_param("xen_nopv", xen_parse_nopv);
>  
> +static __init int xen_parse_no_vector_callback(char *arg)
> +{
> +	no_vector_callback = true;
> +	return 0;
> +}
> +early_param("no_vector_callback", xen_parse_no_vector_callback);
> +
>  bool __init xen_hvm_need_lapic(void)
>  {
>  	if (xen_pv_domain())
> diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
> index 6038c4c35db5..bbebe248b726 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -2010,16 +2010,6 @@ static struct irq_chip xen_percpu_chip __read_mostly = {
>  	.irq_ack		= ack_dynirq,
>  };
>  
> -int xen_set_callback_via(uint64_t via)
> -{
> -	struct xen_hvm_param a;
> -	a.domid = DOMID_SELF;
> -	a.index = HVM_PARAM_CALLBACK_IRQ;
> -	a.value = via;
> -	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
> -}
> -EXPORT_SYMBOL_GPL(xen_set_callback_via);
> -
>  #ifdef CONFIG_XEN_PVHVM
>  /* Vector callbacks are better than PCI interrupts to receive event
>   * channel notifications because we can receive vector callbacks on any
> diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
> index dd911e1ff782..5c3015a90a73 100644
> --- a/drivers/xen/platform-pci.c
> +++ b/drivers/xen/platform-pci.c
> @@ -132,6 +132,13 @@ static int platform_pci_probe(struct pci_dev *pdev,
>  			dev_warn(&pdev->dev, "request_irq failed err=%d\n", ret);
>  			goto out;
>  		}
> +		/*
> +		 * It doesn't strictly *have* to run on CPU0 but it sure
> +		 * as hell better process the event channel ports delivered
> +		 * to CPU0.
> +		 */
> +		irq_set_affinity(pdev->irq, cpumask_of(0));
> +


Is the concern here that it won't be handled at all?


And is this related to the issue this patch is addressing?


>  		callback_via = get_callback_via(pdev);
>  		ret = xen_set_callback_via(callback_via);
>  		if (ret) {
> diff --git a/drivers/xen/xenbus/xenbus.h b/drivers/xen/xenbus/xenbus.h
> index 5f5b8a7d5b80..05bbda51103f 100644
> --- a/drivers/xen/xenbus/xenbus.h
> +++ b/drivers/xen/xenbus/xenbus.h
> @@ -113,6 +113,7 @@ int xenbus_probe_node(struct xen_bus_type *bus,
>  		      const char *type,
>  		      const char *nodename);
>  int xenbus_probe_devices(struct xen_bus_type *bus);
> +void xenbus_probe(void);
>  
>  void xenbus_dev_changed(const char *node, struct xen_bus_type *bus);
>  
> diff --git a/drivers/xen/xenbus/xenbus_comms.c b/drivers/xen/xenbus/xenbus_comms.c
> index eb5151fc8efa..e5fda0256feb 100644
> --- a/drivers/xen/xenbus/xenbus_comms.c
> +++ b/drivers/xen/xenbus/xenbus_comms.c
> @@ -57,16 +57,8 @@ DEFINE_MUTEX(xs_response_mutex);
>  static int xenbus_irq;
>  static struct task_struct *xenbus_task;
>  
> -static DECLARE_WORK(probe_work, xenbus_probe);
> -
> -
>  static irqreturn_t wake_waiting(int irq, void *unused)
>  {
> -	if (unlikely(xenstored_ready == 0)) {
> -		xenstored_ready = 1;
> -		schedule_work(&probe_work);
> -	}
> -
>  	wake_up(&xb_waitq);
>  	return IRQ_HANDLED;
>  }
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index 38725d97d909..876f381b100a 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -682,29 +682,63 @@ void unregister_xenstore_notifier(struct notifier_block *nb)
>  }
>  EXPORT_SYMBOL_GPL(unregister_xenstore_notifier);
>  
> -void xenbus_probe(struct work_struct *unused)
> +void xenbus_probe(void)
>  {
>  	xenstored_ready = 1;
>  
> +	/*
> +	 * In the HVM case, xenbus_init() deferred its call to
> +	 * xs_init() in case callbacks were not operational yet.
> +	 * So do it now.
> +	 */
> +	if (xen_store_domain_type == XS_HVM)
> +		xs_init();
> +
>  	/* Notify others that xenstore is up */
>  	blocking_notifier_call_chain(&xenstore_chain, 0, NULL);
>  }
> -EXPORT_SYMBOL_GPL(xenbus_probe);
>  
>  static int __init xenbus_probe_initcall(void)
>  {
> -	if (!xen_domain())
> -		return -ENODEV;
> -
> -	if (xen_initial_domain() || xen_hvm_domain())
> -		return 0;
> +	/*
> +	 * Probe XenBus here in the XS_PV case, and also XS_HVM unless we
> +	 * need to wait for the platform PCI device to come up, which is
> +	 * the (XEN_PVPVM && !xen_have_vector_callback) case.
> +	 */
> +	if (xen_store_domain_type == XS_PV ||
> +	    (xen_store_domain_type == XS_HVM &&
> +	     (!IS_ENABLED(CONFIG_XEN_PVHVM) || xen_have_vector_callback)))
> +		xenbus_probe();
>  
> -	xenbus_probe(NULL);
>  	return 0;
>  }
> -
>  device_initcall(xenbus_probe_initcall);
>  
> +int xen_set_callback_via(uint64_t via)
> +{
> +	struct xen_hvm_param a;
> +	int ret;
> +
> +	a.domid = DOMID_SELF;
> +	a.index = HVM_PARAM_CALLBACK_IRQ;
> +	a.value = via;
> +
> +	ret = HYPERVISOR_hvm_op(HVMOP_set_param, &a);
> +	if (ret)
> +		return ret;
> +
> +	/*
> +	 * If xenbus_probe_initcall() deferred the xenbus_probe()
> +	 * due to the callback not functioning yet, we can do it now.
> +	 */
> +	if (!xenstored_ready && xen_store_domain_type == XS_HVM &&
> +	    IS_ENABLED(CONFIG_XEN_PVHVM) && !xen_have_vector_callback)


I'd create an is_callback_ready() (or something with a better name) helper.


> +		xenbus_probe();
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(xen_set_callback_via);
> +
>  /* Set up event channel for xenstored which is run as a local process
>   * (this is normally used only in dom0)
>   */
> @@ -817,11 +851,17 @@ static int __init xenbus_init(void)
>  		break;
>  	}
>  
> -	/* Initialize the interface to xenstore. */
> -	err = xs_init();
> -	if (err) {
> -		pr_warn("Error initializing xenstore comms: %i\n", err);
> -		goto out_error;
> +	/*
> +	 * HVM domains may not have a functional callback yet. In that
> +	 * case let xs_init() be called from xenbus_probe(), which will
> +	 * get invoked at an appropriate time.
> +	 */
> +	if (xen_store_domain_type != XS_HVM) {


Can we delay xs_init() for !XS_HVM as well? In other words wait until after PCI platform device has been probed (on HVM) and then call xs_init() for everyone.


-boris


> +		err = xs_init();
> +		if (err) {
> +			pr_warn("Error initializing xenstore comms: %i\n", err);
> +			goto out_error;
> +		}
>  	}
>  
>  	if ((xen_store_domain_type != XS_LOCAL) &&
> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> index 5a8315e6d8a6..61202c83d560 100644
> --- a/include/xen/xenbus.h
> +++ b/include/xen/xenbus.h
> @@ -183,7 +183,7 @@ void xs_suspend_cancel(void);
>  
>  struct work_struct;
>  
> -void xenbus_probe(struct work_struct *);
> +void xenbus_probe(void);
>  
>  #define XENBUS_IS_ERR_READ(str) ({			\
>  	if (!IS_ERR(str) && strlen(str) == 0) {		\

Re: [PATCH] xen: Fix event channel callback via INTX/GSI
Posted by David Woodhouse 3 years, 3 months ago
On Fri, 2020-12-18 at 17:20 -0500, boris.ostrovsky@oracle.com wrote:
> Are there other cases where this would be useful? If it's just to
> test a hypervisor under development I would think that people who are
> doing that kind of work are capable of building their own kernel. My
> concern is mostly about having yet another boot option that is of
> interest to very few people who can easily work around not having it.

For hypervisor testing we can just set the Xen major version number in
CPUID to 3, and that stops xs_reset_watches() from doing anything.

cf. https://lkml.org/lkml/2017/4/10/266

Karim ripped out all this INTX code in 2017 because it was broken, and
subsequently put it back because it *was* working for older versions of
Xen, due to that "coincidence". The conclusion back then was that if it
was put back it should at least *work* consistently, and he was going
to send a patch "shortly". This is a that patch :)

> > Add a 'no_vector_callback' command line argument to test it.
>
> This last one should be a separate patch I think.

Could do.

> > +		/*
> > +		 * It doesn't strictly *have* to run on CPU0 but it sure
> > +		 * as hell better process the event channel ports delivered
> > +		 * to CPU0.
> > +		 */
> > +		irq_set_affinity(pdev->irq, cpumask_of(0));
> > +
> 
> 
> Is the concern here that it won't be handled at all?

Indeed, the events don't get handled at all if the PCI interrupt lands
on a CPU other than zero. When the handler calls
xen_hvm_evtchn_do_upcall() that processes pending events for whichever
CPU it happens to be running on, and *not* the events pending for CPU0.
And the boot stops in xs_reset_watches() waiting (without a timeout)
for an interrupt that never gets processed, as before.

> And is this related to the issue this patch is addressing?

It is required to fix the event channel callback via INTX/GSI, yes.
Although it could reasonably be lifted out into a separate patch too.

> >  static int __init xenbus_probe_initcall(void)
> >  {
> > -	if (!xen_domain())
> > -		return -ENODEV;
> > -
> > -	if (xen_initial_domain() || xen_hvm_domain())
> > -		return 0;
> > +	/*
> > +	 * Probe XenBus here in the XS_PV case, and also XS_HVM unless we
> > +	 * need to wait for the platform PCI device to come up, which is
> > +	 * the (XEN_PVPVM && !xen_have_vector_callback) case.
> > +	 */
> > +	if (xen_store_domain_type == XS_PV ||
> > +	    (xen_store_domain_type == XS_HVM &&
> > +	     (!IS_ENABLED(CONFIG_XEN_PVHVM) || xen_have_vector_callback)))
> > +		xenbus_probe();
> >  
> > -	xenbus_probe(NULL);
> >  	return 0;
> >  }
> > -
> >  device_initcall(xenbus_probe_initcall);
> >  
> > +int xen_set_callback_via(uint64_t via)
> > +{
> > +	struct xen_hvm_param a;
> > +	int ret;
> > +
> > +	a.domid = DOMID_SELF;
> > +	a.index = HVM_PARAM_CALLBACK_IRQ;
> > +	a.value = via;
> > +
> > +	ret = HYPERVISOR_hvm_op(HVMOP_set_param, &a);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/*
> > +	 * If xenbus_probe_initcall() deferred the xenbus_probe()
> > +	 * due to the callback not functioning yet, we can do it now.
> > +	 */
> > +	if (!xenstored_ready && xen_store_domain_type == XS_HVM &&
> > +	    IS_ENABLED(CONFIG_XEN_PVHVM) && !xen_have_vector_callback)
> 
> 
> I'd create an is_callback_ready() (or something with a better name)
> helper.

I pondered that, and indeed dropping the XVM/vector conditions and
doing it literally based on whether xen_set_callback_via() had been
called at all (and not too early). But it looks like there are cases
where Arm doesn't call xen_set_callback_via() at all, and it made more
sense to me to live xen_set_callback_via() to sit right here and have
those two conditions within a page of each other in juxtaposition, with
suitable comments. I think that's probably easier to understand and
work with than a "helper".

> 
> > +		xenbus_probe();
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(xen_set_callback_via);
> > +
> >  /* Set up event channel for xenstored which is run as a local process
> >   * (this is normally used only in dom0)
> >   */
> > @@ -817,11 +851,17 @@ static int __init xenbus_init(void)
> >  		break;
> >  	}
> >  
> > -	/* Initialize the interface to xenstore. */
> > -	err = xs_init();
> > -	if (err) {
> > -		pr_warn("Error initializing xenstore comms: %i\n", err);
> > -		goto out_error;
> > +	/*
> > +	 * HVM domains may not have a functional callback yet. In that
> > +	 * case let xs_init() be called from xenbus_probe(), which will
> > +	 * get invoked at an appropriate time.
> > +	 */
> > +	if (xen_store_domain_type != XS_HVM) {
> 
> 
> Can we delay xs_init() for !XS_HVM as well? In other words wait until
> after PCI platform device has been probed (on HVM) and then call
> xs_init() for everyone.

We're half-way there already, because xenbus_probe() *does* happen
later as a device_initcall, and I've just made it call xs_init().

We could make it avoid calling xs_init() from xenbus_init() in the
XS_HVM *and* XS_PV cases fairly easily, and let xenbus_probe() do it.

But right now xenbus_probe() doesn't run for the other cases, so
there'd have to be a mode where it *only* calls xs_init() and doesn't
do the notifier chain. That seems like more churn that was needed so I
didn't do it.

Re: [PATCH] xen: Fix event channel callback via INTX/GSI
Posted by boris.ostrovsky@oracle.com 3 years, 3 months ago
On 12/19/20 3:05 AM, David Woodhouse wrote:
> On Fri, 2020-12-18 at 17:20 -0500, boris.ostrovsky@oracle.com wrote:
>> Are there other cases where this would be useful? If it's just to
>> test a hypervisor under development I would think that people who are
>> doing that kind of work are capable of building their own kernel. My
>> concern is mostly about having yet another boot option that is of
>> interest to very few people who can easily work around not having it.
> For hypervisor testing we can just set the Xen major version number in
> CPUID to 3, and that stops xs_reset_watches() from doing anything.
>
> cf. https://lkml.org/lkml/2017/4/10/266
>
> Karim ripped out all this INTX code in 2017 because it was broken, and
> subsequently put it back because it *was* working for older versions of
> Xen, due to that "coincidence". The conclusion back then was that if it
> was put back it should at least *work* consistently, and he was going
> to send a patch "shortly". This is a that patch :)


Right, I am not arguing about usefulness of the fix, only of the new boot option.


>
>>> Add a 'no_vector_callback' command line argument to test it.
>> This last one should be a separate patch I think.
> Could do.
>
>>> +		/*
>>> +		 * It doesn't strictly *have* to run on CPU0 but it sure
>>> +		 * as hell better process the event channel ports delivered
>>> +		 * to CPU0.
>>> +		 */
>>> +		irq_set_affinity(pdev->irq, cpumask_of(0));
>>> +
>>
>> Is the concern here that it won't be handled at all?
> Indeed, the events don't get handled at all if the PCI interrupt lands
> on a CPU other than zero. When the handler calls
> xen_hvm_evtchn_do_upcall() that processes pending events for whichever
> CPU it happens to be running on, and *not* the events pending for CPU0.
> And the boot stops in xs_reset_watches() waiting (without a timeout)
> for an interrupt that never gets processed, as before.


Yes, I see. Then please do it in a separate patch.


>
>> And is this related to the issue this patch is addressing?
> It is required to fix the event channel callback via INTX/GSI, yes.
> Although it could reasonably be lifted out into a separate patch too.
>
>>>  static int __init xenbus_probe_initcall(void)
>>>  {
>>> -	if (!xen_domain())
>>> -		return -ENODEV;
>>> -
>>> -	if (xen_initial_domain() || xen_hvm_domain())
>>> -		return 0;
>>> +	/*
>>> +	 * Probe XenBus here in the XS_PV case, and also XS_HVM unless we
>>> +	 * need to wait for the platform PCI device to come up, which is
>>> +	 * the (XEN_PVPVM && !xen_have_vector_callback) case.
>>> +	 */
>>> +	if (xen_store_domain_type == XS_PV ||
>>> +	    (xen_store_domain_type == XS_HVM &&
>>> +	     (!IS_ENABLED(CONFIG_XEN_PVHVM) || xen_have_vector_callback)))
>>> +		xenbus_probe();
>>>  
>>> -	xenbus_probe(NULL);
>>>  	return 0;
>>>  }
>>> -
>>>  device_initcall(xenbus_probe_initcall);
>>>  
>>> +int xen_set_callback_via(uint64_t via)
>>> +{
>>> +	struct xen_hvm_param a;
>>> +	int ret;
>>> +
>>> +	a.domid = DOMID_SELF;
>>> +	a.index = HVM_PARAM_CALLBACK_IRQ;
>>> +	a.value = via;
>>> +
>>> +	ret = HYPERVISOR_hvm_op(HVMOP_set_param, &a);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	/*
>>> +	 * If xenbus_probe_initcall() deferred the xenbus_probe()
>>> +	 * due to the callback not functioning yet, we can do it now.
>>> +	 */
>>> +	if (!xenstored_ready && xen_store_domain_type == XS_HVM &&
>>> +	    IS_ENABLED(CONFIG_XEN_PVHVM) && !xen_have_vector_callback)
>>
>> I'd create an is_callback_ready() (or something with a better name)
>> helper.
> I pondered that, and indeed dropping the XVM/vector conditions and
> doing it literally based on whether xen_set_callback_via() had been
> called at all (and not too early). But it looks like there are cases
> where Arm doesn't call xen_set_callback_via() at all, and it made more
> sense to me to live xen_set_callback_via() to sit right here and have
> those two conditions within a page of each other in juxtaposition, with
> suitable comments. I think that's probably easier to understand and
> work with than a "helper".


OK.


>
>>> +		xenbus_probe();
>>> +
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(xen_set_callback_via);
>>> +
>>>  /* Set up event channel for xenstored which is run as a local process
>>>   * (this is normally used only in dom0)
>>>   */
>>> @@ -817,11 +851,17 @@ static int __init xenbus_init(void)
>>>  		break;
>>>  	}
>>>  
>>> -	/* Initialize the interface to xenstore. */
>>> -	err = xs_init();
>>> -	if (err) {
>>> -		pr_warn("Error initializing xenstore comms: %i\n", err);
>>> -		goto out_error;
>>> +	/*
>>> +	 * HVM domains may not have a functional callback yet. In that
>>> +	 * case let xs_init() be called from xenbus_probe(), which will
>>> +	 * get invoked at an appropriate time.
>>> +	 */
>>> +	if (xen_store_domain_type != XS_HVM) {
>>
>> Can we delay xs_init() for !XS_HVM as well? In other words wait until
>> after PCI platform device has been probed (on HVM) and then call
>> xs_init() for everyone.
> We're half-way there already, because xenbus_probe() *does* happen
> later as a device_initcall, and I've just made it call xs_init().
>
> We could make it avoid calling xs_init() from xenbus_init() in the
> XS_HVM *and* XS_PV cases fairly easily, and let xenbus_probe() do it.


Yes, that's along the lines of what I was thinking.


>
> But right now xenbus_probe() doesn't run for the other cases, so
> there'd have to be a mode where it *only* calls xs_init() and doesn't
> do the notifier chain. That seems like more churn that was needed so I
> didn't do it.


You think so? Yes, there would be a couple more places where you'd need to call xenbus_probe() but then you won't have to explain (with comments) why you call xs_init() here and not there and vice versa. It just looks to me a bit more complicated the way you do this but I suppose it's a matter of personal preference.


-boris


Re: [PATCH] xen: Fix event channel callback via INTX/GSI
Posted by David Woodhouse 3 years, 3 months ago
On Sun, 2020-12-20 at 13:01 -0500, boris.ostrovsky@oracle.com wrote:
> On 12/19/20 3:05 AM, David Woodhouse wrote:
> > On Fri, 2020-12-18 at 17:20 -0500, boris.ostrovsky@oracle.com wrote:
> > > Are there other cases where this would be useful? If it's just to
> > > test a hypervisor under development I would think that people who are
> > > doing that kind of work are capable of building their own kernel. My
> > > concern is mostly about having yet another boot option that is of
> > > interest to very few people who can easily work around not having it.
> > 
> > For hypervisor testing we can just set the Xen major version number in
> > CPUID to 3, and that stops xs_reset_watches() from doing anything.
> > 
> > cf. https://lkml.org/lkml/2017/4/10/266
> > 
> > Karim ripped out all this INTX code in 2017 because it was broken, and
> > subsequently put it back because it *was* working for older versions of
> > Xen, due to that "coincidence". The conclusion back then was that if it
> > was put back it should at least *work* consistently, and he was going
> > to send a patch "shortly". This is a that patch :)
> 
> 
> Right, I am not arguing about usefulness of the fix, only of the new boot option.

The boot option also makes it easier for Linux developers to actually
test this.

It's all very well for someone who already has the hypervisor code open
in an emacs buffer, who can turn off vector support then run the canned
test case which spins up that hypervisor and runs Linux nested inside
to. But most people don't live like that, and a command line option
that Linux can use and just run in a normal Xen environment to test
these code paths is kind of useful.


> > > Can we delay xs_init() for !XS_HVM as well? In other words wait until
> > > after PCI platform device has been probed (on HVM) and then call
> > > xs_init() for everyone.
> > 
> > We're half-way there already, because xenbus_probe() *does* happen
> > later as a device_initcall, and I've just made it call xs_init().
> > 
> > We could make it avoid calling xs_init() from xenbus_init() in the
> > XS_HVM *and* XS_PV cases fairly easily, and let xenbus_probe() do it.
> 
> 
> Yes, that's along the lines of what I was thinking.
> 
> 
> > 
> > But right now xenbus_probe() doesn't run for the other cases, so
> > there'd have to be a mode where it *only* calls xs_init() and doesn't
> > do the notifier chain. That seems like more churn that was needed so I
> > didn't do it.
> 
> 
> You think so? Yes, there would be a couple more places where you'd
> need to call xenbus_probe() but then you won't have to explain (with
> comments) why you call xs_init() here and not there and vice versa.
> It just looks to me a bit more complicated the way you do this but I
> suppose it's a matter of personal preference.

I suspect it probably just moves things around and leaves *different*
things to explain (with comments!).

I'll split it out into separate patches (and also fix up the fact that
we're registering the IPI and spinlock event channels even though we're
not going to use them).