[LINUX PATCH v3] xen: add support for initializing xenstore later as HVM domain

Stefano Stabellini posted 1 patch 1 year, 12 months ago
Failed in applying to current master (apply log)
drivers/xen/xenbus/xenbus_probe.c  | 86 +++++++++++++++++++++++-------
include/xen/interface/io/xs_wire.h |  3 ++
2 files changed, 70 insertions(+), 19 deletions(-)
[LINUX PATCH v3] xen: add support for initializing xenstore later as HVM domain
Posted by Stefano Stabellini 1 year, 12 months ago
From: Luca Miccio <lucmiccio@gmail.com>

When running as dom0less guest (HVM domain on ARM) the xenstore event
channel is available at domain creation but the shared xenstore
interface page only becomes available later on.

In that case, wait for a notification on the xenstore event channel,
then complete the xenstore initialization later, when the shared page
is actually available.

The xenstore page has few extra field. Add them to the shared struct.
One of the field is "connection", when the connection is ready, it is
zero. If the connection is not-zero, wait for a notification.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
CC: jgross@suse.com
CC: boris.ostrovsky@oracle.com
---
Changes in v3:
- check for the connection field, if it is not zero, wait for event

Changes in v2:
- remove XENFEAT_xenstore_late_init
---
 drivers/xen/xenbus/xenbus_probe.c  | 86 +++++++++++++++++++++++-------
 include/xen/interface/io/xs_wire.h |  3 ++
 2 files changed, 70 insertions(+), 19 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index fe360c33ce71..dc046d25789e 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -65,6 +65,7 @@
 #include "xenbus.h"
 
 
+static int xs_init_irq;
 int xen_store_evtchn;
 EXPORT_SYMBOL_GPL(xen_store_evtchn);
 
@@ -750,6 +751,17 @@ static void xenbus_probe(void)
 {
 	xenstored_ready = 1;
 
+	if (!xen_store_interface) {
+		xen_store_interface = xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
+						XEN_PAGE_SIZE);
+		/*
+		 * Now it is safe to free the IRQ used for xenstore late
+		 * initialization. No need to unbind: it is about to be
+		 * bound again.
+		 */
+		free_irq(xs_init_irq, &xb_waitq);
+	}
+
 	/*
 	 * In the HVM case, xenbus_init() deferred its call to
 	 * xs_init() in case callbacks were not operational yet.
@@ -798,20 +810,22 @@ static int __init xenbus_probe_initcall(void)
 {
 	/*
 	 * Probe XenBus here in the XS_PV case, and also XS_HVM unless we
-	 * need to wait for the platform PCI device to come up.
+	 * need to wait for the platform PCI device to come up or
+	 * xen_store_interface is not ready.
 	 */
 	if (xen_store_domain_type == XS_PV ||
 	    (xen_store_domain_type == XS_HVM &&
-	     !xs_hvm_defer_init_for_callback()))
+	     !xs_hvm_defer_init_for_callback() &&
+	     xen_store_interface != NULL))
 		xenbus_probe();
 
 	/*
-	 * For XS_LOCAL, spawn a thread which will wait for xenstored
-	 * or a xenstore-stubdom to be started, then probe. It will be
-	 * triggered when communication starts happening, by waiting
-	 * on xb_waitq.
+	 * For XS_LOCAL or when xen_store_interface is not ready, spawn a
+	 * thread which will wait for xenstored or a xenstore-stubdom to be
+	 * started, then probe.  It will be triggered when communication
+	 * starts happening, by waiting on xb_waitq.
 	 */
-	if (xen_store_domain_type == XS_LOCAL) {
+	if (xen_store_domain_type == XS_LOCAL || xen_store_interface == NULL) {
 		struct task_struct *probe_task;
 
 		probe_task = kthread_run(xenbus_probe_thread, NULL,
@@ -907,10 +921,25 @@ static struct notifier_block xenbus_resume_nb = {
 	.notifier_call = xenbus_resume_cb,
 };
 
+static irqreturn_t xenbus_late_init(int irq, void *unused)
+{
+	int err = 0;
+	uint64_t v = 0;
+
+	err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
+	if (err || !v || !~v)
+		return IRQ_HANDLED;
+	xen_store_gfn = (unsigned long)v;
+
+	wake_up(&xb_waitq);
+	return IRQ_HANDLED;
+}
+
 static int __init xenbus_init(void)
 {
 	int err;
 	uint64_t v = 0;
+	bool wait = false;
 	xen_store_domain_type = XS_UNKNOWN;
 
 	if (!xen_domain())
@@ -959,23 +988,42 @@ static int __init xenbus_init(void)
 		 *
 		 * Also recognize all bits set as an invalid value.
 		 */
-		if (!v || !~v) {
+		if (!v) {
 			err = -ENOENT;
 			goto out_error;
 		}
-		/* Avoid truncation on 32-bit. */
+		if (v == ~0ULL) {
+			wait = true;
+		} else {
+			/* Avoid truncation on 32-bit. */
 #if BITS_PER_LONG == 32
-		if (v > ULONG_MAX) {
-			pr_err("%s: cannot handle HVM_PARAM_STORE_PFN=%llx > ULONG_MAX\n",
-			       __func__, v);
-			err = -EINVAL;
-			goto out_error;
-		}
+			if (v > ULONG_MAX) {
+				pr_err("%s: cannot handle HVM_PARAM_STORE_PFN=%llx > ULONG_MAX\n",
+						__func__, v);
+				err = -EINVAL;
+				goto out_error;
+			}
 #endif
-		xen_store_gfn = (unsigned long)v;
-		xen_store_interface =
-			xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
-				  XEN_PAGE_SIZE);
+			xen_store_gfn = (unsigned long)v;
+			xen_store_interface =
+				xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
+					  XEN_PAGE_SIZE);
+			if (xen_store_interface->connection != 0)
+				wait = true;
+		}
+		if (wait) {
+			err = bind_evtchn_to_irqhandler(xen_store_evtchn,
+							xenbus_late_init,
+							0, "xenstore_late_init",
+							&xb_waitq);
+			if (err < 0) {
+				pr_err("xenstore_late_init couldn't bind irq err=%d\n",
+				       err);
+				return err;
+			}
+
+			xs_init_irq = err;
+		}
 		break;
 	default:
 		pr_warn("Xenstore state unknown\n");
diff --git a/include/xen/interface/io/xs_wire.h b/include/xen/interface/io/xs_wire.h
index d40a44f09b16..cd7ae5ebb133 100644
--- a/include/xen/interface/io/xs_wire.h
+++ b/include/xen/interface/io/xs_wire.h
@@ -87,6 +87,9 @@ struct xenstore_domain_interface {
     char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */
     XENSTORE_RING_IDX req_cons, req_prod;
     XENSTORE_RING_IDX rsp_cons, rsp_prod;
+    uint32_t server_features; /* Bitmap of features supported by the server */
+    uint32_t connection;
+    uint32_t error;
 };
 
 /* Violating this is very bad.  See docs/misc/xenstore.txt. */
-- 
2.25.1
Re: [LINUX PATCH v3] xen: add support for initializing xenstore later as HVM domain
Posted by Juergen Gross 1 year, 12 months ago
On 29.04.22 23:10, Stefano Stabellini wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> When running as dom0less guest (HVM domain on ARM) the xenstore event
> channel is available at domain creation but the shared xenstore
> interface page only becomes available later on.
> 
> In that case, wait for a notification on the xenstore event channel,
> then complete the xenstore initialization later, when the shared page
> is actually available.
> 
> The xenstore page has few extra field. Add them to the shared struct.
> One of the field is "connection", when the connection is ready, it is
> zero. If the connection is not-zero, wait for a notification.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> CC: jgross@suse.com
> CC: boris.ostrovsky@oracle.com
> ---
> Changes in v3:
> - check for the connection field, if it is not zero, wait for event
> 
> Changes in v2:
> - remove XENFEAT_xenstore_late_init
> ---
>   drivers/xen/xenbus/xenbus_probe.c  | 86 +++++++++++++++++++++++-------
>   include/xen/interface/io/xs_wire.h |  3 ++
>   2 files changed, 70 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index fe360c33ce71..dc046d25789e 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -65,6 +65,7 @@
>   #include "xenbus.h"
>   
>   
> +static int xs_init_irq;
>   int xen_store_evtchn;
>   EXPORT_SYMBOL_GPL(xen_store_evtchn);
>   
> @@ -750,6 +751,17 @@ static void xenbus_probe(void)
>   {
>   	xenstored_ready = 1;
>   
> +	if (!xen_store_interface) {
> +		xen_store_interface = xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
> +						XEN_PAGE_SIZE);
> +		/*
> +		 * Now it is safe to free the IRQ used for xenstore late
> +		 * initialization. No need to unbind: it is about to be
> +		 * bound again.
> +		 */
> +		free_irq(xs_init_irq, &xb_waitq);
> +	}
> +
>   	/*
>   	 * In the HVM case, xenbus_init() deferred its call to
>   	 * xs_init() in case callbacks were not operational yet.
> @@ -798,20 +810,22 @@ static int __init xenbus_probe_initcall(void)
>   {
>   	/*
>   	 * Probe XenBus here in the XS_PV case, and also XS_HVM unless we
> -	 * need to wait for the platform PCI device to come up.
> +	 * need to wait for the platform PCI device to come up or
> +	 * xen_store_interface is not ready.
>   	 */
>   	if (xen_store_domain_type == XS_PV ||
>   	    (xen_store_domain_type == XS_HVM &&
> -	     !xs_hvm_defer_init_for_callback()))
> +	     !xs_hvm_defer_init_for_callback() &&
> +	     xen_store_interface != NULL))
>   		xenbus_probe();
>   
>   	/*
> -	 * For XS_LOCAL, spawn a thread which will wait for xenstored
> -	 * or a xenstore-stubdom to be started, then probe. It will be
> -	 * triggered when communication starts happening, by waiting
> -	 * on xb_waitq.
> +	 * For XS_LOCAL or when xen_store_interface is not ready, spawn a
> +	 * thread which will wait for xenstored or a xenstore-stubdom to be
> +	 * started, then probe.  It will be triggered when communication
> +	 * starts happening, by waiting on xb_waitq.
>   	 */
> -	if (xen_store_domain_type == XS_LOCAL) {
> +	if (xen_store_domain_type == XS_LOCAL || xen_store_interface == NULL) {
>   		struct task_struct *probe_task;
>   
>   		probe_task = kthread_run(xenbus_probe_thread, NULL,
> @@ -907,10 +921,25 @@ static struct notifier_block xenbus_resume_nb = {
>   	.notifier_call = xenbus_resume_cb,
>   };
>   
> +static irqreturn_t xenbus_late_init(int irq, void *unused)
> +{
> +	int err = 0;
> +	uint64_t v = 0;
> +
> +	err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
> +	if (err || !v || !~v)
> +		return IRQ_HANDLED;
> +	xen_store_gfn = (unsigned long)v;
> +
> +	wake_up(&xb_waitq);
> +	return IRQ_HANDLED;
> +}
> +
>   static int __init xenbus_init(void)
>   {
>   	int err;
>   	uint64_t v = 0;
> +	bool wait = false;
>   	xen_store_domain_type = XS_UNKNOWN;
>   
>   	if (!xen_domain())
> @@ -959,23 +988,42 @@ static int __init xenbus_init(void)
>   		 *
>   		 * Also recognize all bits set as an invalid value.
>   		 */
> -		if (!v || !~v) {
> +		if (!v) {
>   			err = -ENOENT;
>   			goto out_error;
>   		}
> -		/* Avoid truncation on 32-bit. */
> +		if (v == ~0ULL) {
> +			wait = true;
> +		} else {
> +			/* Avoid truncation on 32-bit. */
>   #if BITS_PER_LONG == 32
> -		if (v > ULONG_MAX) {
> -			pr_err("%s: cannot handle HVM_PARAM_STORE_PFN=%llx > ULONG_MAX\n",
> -			       __func__, v);
> -			err = -EINVAL;
> -			goto out_error;
> -		}
> +			if (v > ULONG_MAX) {
> +				pr_err("%s: cannot handle HVM_PARAM_STORE_PFN=%llx > ULONG_MAX\n",
> +						__func__, v);
> +				err = -EINVAL;
> +				goto out_error;
> +			}
>   #endif
> -		xen_store_gfn = (unsigned long)v;
> -		xen_store_interface =
> -			xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
> -				  XEN_PAGE_SIZE);
> +			xen_store_gfn = (unsigned long)v;
> +			xen_store_interface =
> +				xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
> +					  XEN_PAGE_SIZE);
> +			if (xen_store_interface->connection != 0)

Please use XENSTORE_CONNECTED instead of 0.

> +				wait = true;
> +		}
> +		if (wait) {
> +			err = bind_evtchn_to_irqhandler(xen_store_evtchn,
> +							xenbus_late_init,
> +							0, "xenstore_late_init",
> +							&xb_waitq);
> +			if (err < 0) {
> +				pr_err("xenstore_late_init couldn't bind irq err=%d\n",
> +				       err);
> +				return err;
> +			}
> +
> +			xs_init_irq = err;
> +		}
>   		break;
>   	default:
>   		pr_warn("Xenstore state unknown\n");
> diff --git a/include/xen/interface/io/xs_wire.h b/include/xen/interface/io/xs_wire.h
> index d40a44f09b16..cd7ae5ebb133 100644
> --- a/include/xen/interface/io/xs_wire.h
> +++ b/include/xen/interface/io/xs_wire.h
> @@ -87,6 +87,9 @@ struct xenstore_domain_interface {
>       char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */
>       XENSTORE_RING_IDX req_cons, req_prod;
>       XENSTORE_RING_IDX rsp_cons, rsp_prod;
> +    uint32_t server_features; /* Bitmap of features supported by the server */
> +    uint32_t connection;
> +    uint32_t error;
>   };
>   
>   /* Violating this is very bad.  See docs/misc/xenstore.txt. */

Please do a full sync of the header. This will enable you to use the
correct defines instead of hard coded numbers.


Juergen
Re: [LINUX PATCH v3] xen: add support for initializing xenstore later as HVM domain
Posted by Boris Ostrovsky 1 year, 12 months ago
On 4/29/22 5:10 PM, Stefano Stabellini wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
>
> When running as dom0less guest (HVM domain on ARM) the xenstore event
> channel is available at domain creation but the shared xenstore
> interface page only becomes available later on.
>
> In that case, wait for a notification on the xenstore event channel,
> then complete the xenstore initialization later, when the shared page
> is actually available.
>
> The xenstore page has few extra field. Add them to the shared struct.
> One of the field is "connection", when the connection is ready, it is
> zero. If the connection is not-zero, wait for a notification.
>
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> CC: jgross@suse.com
> CC: boris.ostrovsky@oracle.com
> ---
> Changes in v3:
> - check for the connection field, if it is not zero, wait for event
>
> Changes in v2:
> - remove XENFEAT_xenstore_late_init
> ---
>   drivers/xen/xenbus/xenbus_probe.c  | 86 +++++++++++++++++++++++-------
>   include/xen/interface/io/xs_wire.h |  3 ++
>   2 files changed, 70 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index fe360c33ce71..dc046d25789e 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -65,6 +65,7 @@
>   #include "xenbus.h"
>   
>   
> +static int xs_init_irq;
>   int xen_store_evtchn;
>   EXPORT_SYMBOL_GPL(xen_store_evtchn);
>   
> @@ -750,6 +751,17 @@ static void xenbus_probe(void)
>   {
>   	xenstored_ready = 1;
>   
> +	if (!xen_store_interface) {
> +		xen_store_interface = xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
> +						XEN_PAGE_SIZE);
> +		/*
> +		 * Now it is safe to free the IRQ used for xenstore late
> +		 * initialization. No need to unbind: it is about to be
> +		 * bound again.


This assumes knowledge of bind/unbind internals. I think we should unbind.


> +		 */
> +		free_irq(xs_init_irq, &xb_waitq);
> +	}
> +



> @@ -959,23 +988,42 @@ static int __init xenbus_init(void)
>   		 *
>   		 * Also recognize all bits set as an invalid value.


Is this comment still correct?


>   		 */
> -		if (!v || !~v) {
> +		if (!v) {
>   			err = -ENOENT;
>   			goto out_error;
>   		}
> -		/* Avoid truncation on 32-bit. */
> +		if (v == ~0ULL) {
> +			wait = true;
> +		} else {
> +			/* Avoid truncation on 32-bit. */
>
Re: [LINUX PATCH v3] xen: add support for initializing xenstore later as HVM domain
Posted by Stefano Stabellini 1 year, 12 months ago
On Fri, 29 Apr 2022, Boris Ostrovsky wrote:
> On 4/29/22 5:10 PM, Stefano Stabellini wrote:
> > From: Luca Miccio <lucmiccio@gmail.com>
> > 
> > When running as dom0less guest (HVM domain on ARM) the xenstore event
> > channel is available at domain creation but the shared xenstore
> > interface page only becomes available later on.
> > 
> > In that case, wait for a notification on the xenstore event channel,
> > then complete the xenstore initialization later, when the shared page
> > is actually available.
> > 
> > The xenstore page has few extra field. Add them to the shared struct.
> > One of the field is "connection", when the connection is ready, it is
> > zero. If the connection is not-zero, wait for a notification.
> > 
> > Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > CC: jgross@suse.com
> > CC: boris.ostrovsky@oracle.com
> > ---
> > Changes in v3:
> > - check for the connection field, if it is not zero, wait for event
> > 
> > Changes in v2:
> > - remove XENFEAT_xenstore_late_init
> > ---
> >   drivers/xen/xenbus/xenbus_probe.c  | 86 +++++++++++++++++++++++-------
> >   include/xen/interface/io/xs_wire.h |  3 ++
> >   2 files changed, 70 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/xen/xenbus/xenbus_probe.c
> > b/drivers/xen/xenbus/xenbus_probe.c
> > index fe360c33ce71..dc046d25789e 100644
> > --- a/drivers/xen/xenbus/xenbus_probe.c
> > +++ b/drivers/xen/xenbus/xenbus_probe.c
> > @@ -65,6 +65,7 @@
> >   #include "xenbus.h"
> >     +static int xs_init_irq;
> >   int xen_store_evtchn;
> >   EXPORT_SYMBOL_GPL(xen_store_evtchn);
> >   @@ -750,6 +751,17 @@ static void xenbus_probe(void)
> >   {
> >   	xenstored_ready = 1;
> >   +	if (!xen_store_interface) {
> > +		xen_store_interface = xen_remap(xen_store_gfn <<
> > XEN_PAGE_SHIFT,
> > +						XEN_PAGE_SIZE);
> > +		/*
> > +		 * Now it is safe to free the IRQ used for xenstore late
> > +		 * initialization. No need to unbind: it is about to be
> > +		 * bound again.
> 
> 
> This assumes knowledge of bind/unbind internals. I think we should unbind.

I gave it a try and there is a problem with unbinding the IRQ here. If I
do that, later when xb_init_comms calls bind_evtchn_to_irqhandler, the
event channel doesn't fire anymore. I did some testing and debugging and
as far as I can tell the issue is that if we call unbind_from_irqhandler
here, we end up calling xen_evtchn_close. Then, when xb_init_comms calls
bind_evtchn_to_irqhandler on the same evtchn, it is not enough to
receive event notifications anymore because from Xen POV the evtchn is
"closed".

If I comment out xen_evtchn_close() in __unbind_from_irq, then yes, I
can call unbind_from_irqhandler here instead of free_irq and everything
works.

My suggestion is to keep the call to free_irq here (not
unbind_from_irqhandler) and improve the in-code comment.

 
> > +		 */
> > +		free_irq(xs_init_irq, &xb_waitq);
> > +	}
> > +
> 
> 
> 
> > @@ -959,23 +988,42 @@ static int __init xenbus_init(void)
> >   		 *
> >   		 * Also recognize all bits set as an invalid value.
> 
> 
> Is this comment still correct?

I can improve the comment

 
> >   		 */
> > -		if (!v || !~v) {
> > +		if (!v) {
> >   			err = -ENOENT;
> >   			goto out_error;
> >   		}
> > -		/* Avoid truncation on 32-bit. */
> > +		if (v == ~0ULL) {
> > +			wait = true;
> > +		} else {
> > +			/* Avoid truncation on 32-bit. */
Re: [LINUX PATCH v3] xen: add support for initializing xenstore later as HVM domain
Posted by Boris Ostrovsky 1 year, 12 months ago
On 5/2/22 8:36 PM, Stefano Stabellini wrote:
>
> I gave it a try and there is a problem with unbinding the IRQ here. If I
> do that, later when xb_init_comms calls bind_evtchn_to_irqhandler, the
> event channel doesn't fire anymore. I did some testing and debugging and
> as far as I can tell the issue is that if we call unbind_from_irqhandler
> here, we end up calling xen_evtchn_close. Then, when xb_init_comms calls
> bind_evtchn_to_irqhandler on the same evtchn, it is not enough to
> receive event notifications anymore because from Xen POV the evtchn is
> "closed".


Ah, yes. That's unfortunate.


>
> If I comment out xen_evtchn_close() in __unbind_from_irq, then yes, I
> can call unbind_from_irqhandler here instead of free_irq and everything
> works.
>
> My suggestion is to keep the call to free_irq here (not
> unbind_from_irqhandler) and improve the in-code comment.


OK.


You could add an argument to unbind_from_irq() to keep the event channel open (I in fact am not sure it should be closing the channel since bind routines don't open it) but that looks pretty ugly too.


-boris