[PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred

Juergen Gross posted 1 patch 3 years, 12 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/xen tags/patchew/20200428155144.8253-1-jgross@suse.com
Maintainers: Wei Liu <wl@xen.org>, Ian Jackson <ian.jackson@eu.citrix.com>
tools/xenstore/xenstored_domain.c | 47 ++++++++++++++++-----------------------
1 file changed, 19 insertions(+), 28 deletions(-)
[PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Juergen Gross 3 years, 12 months ago
The XS_INTRODUCE command has two parameters: the mfn (or better: gfn)
of the domain's xenstore ring page and the event channel of the
domain for communicating with Xenstore.

The gfn is not really needed. It is stored in the per-domain struct
in xenstored and in case of another XS_INTRODUCE for the domain it
is tested to match the original value. If it doesn't match the
command is aborted via EINVAL.

Today there shouldn't be multiple XS_INTRODUCE requests for the same
domain issued, so the mfn/gfn can just be ignored and multiple
XS_INTRODUCE commands can be rejected without testing the mfn/gfn.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_domain.c | 47 ++++++++++++++++-----------------------
 1 file changed, 19 insertions(+), 28 deletions(-)

diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index 5858185211..17328f9fc9 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -369,7 +369,6 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
 	struct domain *domain;
 	char *vec[3];
 	unsigned int domid;
-	unsigned long mfn;
 	evtchn_port_t port;
 	int rc;
 	struct xenstore_domain_interface *interface;
@@ -381,7 +380,7 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
 		return EACCES;
 
 	domid = atoi(vec[0]);
-	mfn = atol(vec[1]);
+	/* Ignore the mfn, we don't need it. */
 	port = atoi(vec[2]);
 
 	/* Sanity check args. */
@@ -390,34 +389,26 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
 
 	domain = find_domain_by_domid(domid);
 
-	if (domain == NULL) {
-		interface = map_interface(domid);
-		if (!interface)
-			return errno;
-		/* Hang domain off "in" until we're finished. */
-		domain = new_domain(in, domid, port);
-		if (!domain) {
-			rc = errno;
-			unmap_interface(interface);
-			return rc;
-		}
-		domain->interface = interface;
-		domain->mfn = mfn;
-
-		/* Now domain belongs to its connection. */
-		talloc_steal(domain->conn, domain);
-
-		fire_watches(NULL, in, "@introduceDomain", false);
-	} else if ((domain->mfn == mfn) && (domain->conn != conn)) {
-		/* Use XS_INTRODUCE for recreating the xenbus event-channel. */
-		if (domain->port)
-			xenevtchn_unbind(xce_handle, domain->port);
-		rc = xenevtchn_bind_interdomain(xce_handle, domid, port);
-		domain->port = (rc == -1) ? 0 : rc;
-		domain->remote_port = port;
-	} else
+	if (domain)
 		return EINVAL;
 
+	interface = map_interface(domid);
+	if (!interface)
+		return errno;
+	/* Hang domain off "in" until we're finished. */
+	domain = new_domain(in, domid, port);
+	if (!domain) {
+		rc = errno;
+		unmap_interface(interface);
+		return rc;
+	}
+	domain->interface = interface;
+
+	/* Now domain belongs to its connection. */
+	talloc_steal(domain->conn, domain);
+
+	fire_watches(NULL, in, "@introduceDomain", false);
+
 	domain_conn_reset(domain);
 
 	send_ack(conn, XS_INTRODUCE);
-- 
2.16.4


Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Julien Grall 3 years, 12 months ago
Hi Juergen,

On Tue, 28 Apr 2020 at 16:53, Juergen Gross <jgross@suse.com> wrote:
>
> The XS_INTRODUCE command has two parameters: the mfn (or better: gfn)
> of the domain's xenstore ring page and the event channel of the
> domain for communicating with Xenstore.
>
> The gfn is not really needed. It is stored in the per-domain struct
> in xenstored and in case of another XS_INTRODUCE for the domain it
> is tested to match the original value. If it doesn't match the
> command is aborted via EINVAL.
>
> Today there shouldn't be multiple XS_INTRODUCE requests for the same
> domain issued, so the mfn/gfn can just be ignored and multiple
> XS_INTRODUCE commands can be rejected without testing the mfn/gfn.

So there is a comment in the else part:

/* Use XS_INTRODUCE for recreating the xenbus event-channel. */

From the commit message this is not entirely clear why we want to
prevent recreating the event-channel. Can you expand it?

>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  tools/xenstore/xenstored_domain.c | 47 ++++++++++++++++-----------------------
>  1 file changed, 19 insertions(+), 28 deletions(-)
>
> diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
> index 5858185211..17328f9fc9 100644
> --- a/tools/xenstore/xenstored_domain.c
> +++ b/tools/xenstore/xenstored_domain.c
> @@ -369,7 +369,6 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
>         struct domain *domain;
>         char *vec[3];
>         unsigned int domid;
> -       unsigned long mfn;
>         evtchn_port_t port;
>         int rc;
>         struct xenstore_domain_interface *interface;
> @@ -381,7 +380,7 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
>                 return EACCES;
>
>         domid = atoi(vec[0]);
> -       mfn = atol(vec[1]);
> +       /* Ignore the mfn, we don't need it. */

s/mfn/GFN/

>         port = atoi(vec[2]);
>
>         /* Sanity check args. */
> @@ -390,34 +389,26 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
>
>         domain = find_domain_by_domid(domid);
>
> -       if (domain == NULL) {
> -               interface = map_interface(domid);
> -               if (!interface)
> -                       return errno;
> -               /* Hang domain off "in" until we're finished. */
> -               domain = new_domain(in, domid, port);
> -               if (!domain) {
> -                       rc = errno;
> -                       unmap_interface(interface);
> -                       return rc;
> -               }
> -               domain->interface = interface;
> -               domain->mfn = mfn;

AFAICT domain->mfn is not used anymore, so could we remove the field?

Cheers,

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Jürgen Groß 3 years, 12 months ago
On 28.04.20 22:55, Julien Grall wrote:
> Hi Juergen,
> 
> On Tue, 28 Apr 2020 at 16:53, Juergen Gross <jgross@suse.com> wrote:
>>
>> The XS_INTRODUCE command has two parameters: the mfn (or better: gfn)
>> of the domain's xenstore ring page and the event channel of the
>> domain for communicating with Xenstore.
>>
>> The gfn is not really needed. It is stored in the per-domain struct
>> in xenstored and in case of another XS_INTRODUCE for the domain it
>> is tested to match the original value. If it doesn't match the
>> command is aborted via EINVAL.
>>
>> Today there shouldn't be multiple XS_INTRODUCE requests for the same
>> domain issued, so the mfn/gfn can just be ignored and multiple
>> XS_INTRODUCE commands can be rejected without testing the mfn/gfn.
> 
> So there is a comment in the else part:
> 
> /* Use XS_INTRODUCE for recreating the xenbus event-channel. */
> 
>  From the commit message this is not entirely clear why we want to
> prevent recreating the event-channel. Can you expand it?

Recreating the event channel would be fine, but I don't see why it
would ever be needed. And XS_INTRODUCE is called only at domain creation
time today, so there is obviously no need for repeating this call.

Maybe the idea was to do this after sending a XS_RESUME command, which
isn't used anywhere in Xen and is another part of Xenstore which doesn't
make any sense today.

> 
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   tools/xenstore/xenstored_domain.c | 47 ++++++++++++++++-----------------------
>>   1 file changed, 19 insertions(+), 28 deletions(-)
>>
>> diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
>> index 5858185211..17328f9fc9 100644
>> --- a/tools/xenstore/xenstored_domain.c
>> +++ b/tools/xenstore/xenstored_domain.c
>> @@ -369,7 +369,6 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
>>          struct domain *domain;
>>          char *vec[3];
>>          unsigned int domid;
>> -       unsigned long mfn;
>>          evtchn_port_t port;
>>          int rc;
>>          struct xenstore_domain_interface *interface;
>> @@ -381,7 +380,7 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
>>                  return EACCES;
>>
>>          domid = atoi(vec[0]);
>> -       mfn = atol(vec[1]);
>> +       /* Ignore the mfn, we don't need it. */
> 
> s/mfn/GFN/

Okay, then I should probably change the parameter description, too.

> 
>>          port = atoi(vec[2]);
>>
>>          /* Sanity check args. */
>> @@ -390,34 +389,26 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
>>
>>          domain = find_domain_by_domid(domid);
>>
>> -       if (domain == NULL) {
>> -               interface = map_interface(domid);
>> -               if (!interface)
>> -                       return errno;
>> -               /* Hang domain off "in" until we're finished. */
>> -               domain = new_domain(in, domid, port);
>> -               if (!domain) {
>> -                       rc = errno;
>> -                       unmap_interface(interface);
>> -                       return rc;
>> -               }
>> -               domain->interface = interface;
>> -               domain->mfn = mfn;
> 
> AFAICT domain->mfn is not used anymore, so could we remove the field?

Oh, yes, I missed that.


Juergen

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Julien Grall 3 years, 12 months ago
Hi Juergen,

On 29/04/2020 06:51, Jürgen Groß wrote:
> On 28.04.20 22:55, Julien Grall wrote:
>> Hi Juergen,
>>
>> On Tue, 28 Apr 2020 at 16:53, Juergen Gross <jgross@suse.com> wrote:
>>>
>>> The XS_INTRODUCE command has two parameters: the mfn (or better: gfn)
>>> of the domain's xenstore ring page and the event channel of the
>>> domain for communicating with Xenstore.
>>>
>>> The gfn is not really needed. It is stored in the per-domain struct
>>> in xenstored and in case of another XS_INTRODUCE for the domain it
>>> is tested to match the original value. If it doesn't match the
>>> command is aborted via EINVAL.
>>>
>>> Today there shouldn't be multiple XS_INTRODUCE requests for the same
>>> domain issued, so the mfn/gfn can just be ignored and multiple
>>> XS_INTRODUCE commands can be rejected without testing the mfn/gfn.
>>
>> So there is a comment in the else part:
>>
>> /* Use XS_INTRODUCE for recreating the xenbus event-channel. */
>>
>>  From the commit message this is not entirely clear why we want to
>> prevent recreating the event-channel. Can you expand it?
> 
> Recreating the event channel would be fine, but I don't see why it
> would ever be needed. And XS_INTRODUCE is called only at domain creation
> time today, so there is obviously no need for repeating this call.
> 
> Maybe the idea was to do this after sending a XS_RESUME command, which
> isn't used anywhere in Xen and is another part of Xenstore which doesn't
> make any sense today.

Commit f6cc37ea8ac71385b60507c034519f304da75f4c "tools/oxenstored: port 
XS_INTRODUCE evtchn rebind function from cxenstored" added the exact 
same behavior in the OCaml XenStored last year.

This was introduced 12 years after C XenStored, so surely someone think 
this is useful. We should check why this was introduced in OCaml 
XenStored (I have CCed the author of the patch).

If we still think this is not useful, then you should add an explanation 
in the commit message why the two implementations diverge and possibly 
update the spec.

Cheers,

-- 
Julien Grall

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Igor Druzhinin 3 years, 12 months ago
On 29/04/2020 10:22, Julien Grall wrote:
> Hi Juergen,
> 
> On 29/04/2020 06:51, Jürgen Groß wrote:
>>
>> Recreating the event channel would be fine, but I don't see why it
>> would ever be needed. And XS_INTRODUCE is called only at domain creation
>> time today, so there is obviously no need for repeating this call.
>>
>> Maybe the idea was to do this after sending a XS_RESUME command, which
>> isn't used anywhere in Xen and is another part of Xenstore which doesn't
>> make any sense today.
> 
> Commit f6cc37ea8ac71385b60507c034519f304da75f4c "tools/oxenstored: port XS_INTRODUCE evtchn rebind function from cxenstored" added the exact same behavior in the OCaml XenStored last year.
> 
> This was introduced 12 years after C XenStored, so surely someone think this is useful. We should check why this was introduced in OCaml XenStored (I have CCed the author of the patch).
> 
> If we still think this is not useful, then you should add an explanation in the commit message why the two implementations diverge and possibly update the spec.

Thanks for CC, Julien.

We indeed already use this functionality in our toolstack for guest kdump
functions. It's not possible in XAPI to adopt libxl model where almost everything
is restarted for a domain entering kdump, so we have to use this message to
rebind xenstore evtchn after soft reset without shutting down backends and
recreating the whole subtree (frontends reconnect fine after that).

We obviously only require it for now to be present in oxenstored only.
Please don't remove this functionality if possible.

Igor 

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Jürgen Groß 3 years, 12 months ago
On 29.04.20 12:39, Igor Druzhinin wrote:
> On 29/04/2020 10:22, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 29/04/2020 06:51, Jürgen Groß wrote:
>>>
>>> Recreating the event channel would be fine, but I don't see why it
>>> would ever be needed. And XS_INTRODUCE is called only at domain creation
>>> time today, so there is obviously no need for repeating this call.
>>>
>>> Maybe the idea was to do this after sending a XS_RESUME command, which
>>> isn't used anywhere in Xen and is another part of Xenstore which doesn't
>>> make any sense today.
>>
>> Commit f6cc37ea8ac71385b60507c034519f304da75f4c "tools/oxenstored: port XS_INTRODUCE evtchn rebind function from cxenstored" added the exact same behavior in the OCaml XenStored last year.
>>
>> This was introduced 12 years after C XenStored, so surely someone think this is useful. We should check why this was introduced in OCaml XenStored (I have CCed the author of the patch).
>>
>> If we still think this is not useful, then you should add an explanation in the commit message why the two implementations diverge and possibly update the spec.
> 
> Thanks for CC, Julien.
> 
> We indeed already use this functionality in our toolstack for guest kdump
> functions. It's not possible in XAPI to adopt libxl model where almost everything
> is restarted for a domain entering kdump, so we have to use this message to
> rebind xenstore evtchn after soft reset without shutting down backends and
> recreating the whole subtree (frontends reconnect fine after that).
> 
> We obviously only require it for now to be present in oxenstored only.
> Please don't remove this functionality if possible.

If I read handling in libxl correctly, in the soft reset case XS_RELEASE
is issued before doing another XS_INTRODUCE. XS_RELEASE will result in
xenstored throwing away its related struct domain, so XS_INTRODUCE will
be possible again.


Juergen

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Igor Druzhinin 3 years, 12 months ago
On 29/04/2020 11:49, Jürgen Groß wrote:
> On 29.04.20 12:39, Igor Druzhinin wrote:
>> On 29/04/2020 10:22, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 29/04/2020 06:51, Jürgen Groß wrote:
>>>>
>>>> Recreating the event channel would be fine, but I don't see why it
>>>> would ever be needed. And XS_INTRODUCE is called only at domain creation
>>>> time today, so there is obviously no need for repeating this call.
>>>>
>>>> Maybe the idea was to do this after sending a XS_RESUME command, which
>>>> isn't used anywhere in Xen and is another part of Xenstore which doesn't
>>>> make any sense today.
>>>
>>> Commit f6cc37ea8ac71385b60507c034519f304da75f4c "tools/oxenstored: port XS_INTRODUCE evtchn rebind function from cxenstored" added the exact same behavior in the OCaml XenStored last year.
>>>
>>> This was introduced 12 years after C XenStored, so surely someone think this is useful. We should check why this was introduced in OCaml XenStored (I have CCed the author of the patch).
>>>
>>> If we still think this is not useful, then you should add an explanation in the commit message why the two implementations diverge and possibly update the spec.
>>
>> Thanks for CC, Julien.
>>
>> We indeed already use this functionality in our toolstack for guest kdump
>> functions. It's not possible in XAPI to adopt libxl model where almost everything
>> is restarted for a domain entering kdump, so we have to use this message to
>> rebind xenstore evtchn after soft reset without shutting down backends and
>> recreating the whole subtree (frontends reconnect fine after that).
>>
>> We obviously only require it for now to be present in oxenstored only.
>> Please don't remove this functionality if possible.
> 
> If I read handling in libxl correctly, in the soft reset case XS_RELEASE
> is issued before doing another XS_INTRODUCE. XS_RELEASE will result in
> xenstored throwing away its related struct domain, so XS_INTRODUCE will
> be possible again.

From what I remember it was not possible to keep xenstored data for a domain
and at the same time perform release-introduce cycle (at least in oxenstored).
It also involved firing @releaseDomain which caused havoc in other part of
the toolstack.

Igor

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Jürgen Groß 3 years, 12 months ago
On 29.04.20 13:04, Igor Druzhinin wrote:
> On 29/04/2020 11:49, Jürgen Groß wrote:
>> On 29.04.20 12:39, Igor Druzhinin wrote:
>>> On 29/04/2020 10:22, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> On 29/04/2020 06:51, Jürgen Groß wrote:
>>>>>
>>>>> Recreating the event channel would be fine, but I don't see why it
>>>>> would ever be needed. And XS_INTRODUCE is called only at domain creation
>>>>> time today, so there is obviously no need for repeating this call.
>>>>>
>>>>> Maybe the idea was to do this after sending a XS_RESUME command, which
>>>>> isn't used anywhere in Xen and is another part of Xenstore which doesn't
>>>>> make any sense today.
>>>>
>>>> Commit f6cc37ea8ac71385b60507c034519f304da75f4c "tools/oxenstored: port XS_INTRODUCE evtchn rebind function from cxenstored" added the exact same behavior in the OCaml XenStored last year.
>>>>
>>>> This was introduced 12 years after C XenStored, so surely someone think this is useful. We should check why this was introduced in OCaml XenStored (I have CCed the author of the patch).
>>>>
>>>> If we still think this is not useful, then you should add an explanation in the commit message why the two implementations diverge and possibly update the spec.
>>>
>>> Thanks for CC, Julien.
>>>
>>> We indeed already use this functionality in our toolstack for guest kdump
>>> functions. It's not possible in XAPI to adopt libxl model where almost everything
>>> is restarted for a domain entering kdump, so we have to use this message to
>>> rebind xenstore evtchn after soft reset without shutting down backends and
>>> recreating the whole subtree (frontends reconnect fine after that).
>>>
>>> We obviously only require it for now to be present in oxenstored only.
>>> Please don't remove this functionality if possible.
>>
>> If I read handling in libxl correctly, in the soft reset case XS_RELEASE
>> is issued before doing another XS_INTRODUCE. XS_RELEASE will result in
>> xenstored throwing away its related struct domain, so XS_INTRODUCE will
>> be possible again.
> 
>  From what I remember it was not possible to keep xenstored data for a domain
> and at the same time perform release-introduce cycle (at least in oxenstored).
> It also involved firing @releaseDomain which caused havoc in other part of
> the toolstack.

Wei, Ian, can you please tell me where I'm wrong?

A soft reset should restart the domain in a clean state. AFAIK libxl is
handling that by doing kind of in-place save-restore, including calling
xs_release_domain() and later xs_introduce_domain(). This should result
in xenstored throwing away all state it has regarding the domain and
then restarting with a new (internal) domain instance.

Is XAPI doing soft reset differently? Why should there be a need for
keeping xenstored data across a soft reset?


Juergen

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Igor Druzhinin 3 years, 12 months ago
On 29/04/2020 13:29, Jürgen Groß wrote:
> On 29.04.20 13:04, Igor Druzhinin wrote:
>> On 29/04/2020 11:49, Jürgen Groß wrote:
>>> On 29.04.20 12:39, Igor Druzhinin wrote:
>>>> On 29/04/2020 10:22, Julien Grall wrote:
>>>>> Hi Juergen,
>>>>>
>>>>> On 29/04/2020 06:51, Jürgen Groß wrote:
>>>>>>
>>>>>> Recreating the event channel would be fine, but I don't see why it
>>>>>> would ever be needed. And XS_INTRODUCE is called only at domain creation
>>>>>> time today, so there is obviously no need for repeating this call.
>>>>>>
>>>>>> Maybe the idea was to do this after sending a XS_RESUME command, which
>>>>>> isn't used anywhere in Xen and is another part of Xenstore which doesn't
>>>>>> make any sense today.
>>>>>
>>>>> Commit f6cc37ea8ac71385b60507c034519f304da75f4c "tools/oxenstored: port XS_INTRODUCE evtchn rebind function from cxenstored" added the exact same behavior in the OCaml XenStored last year.
>>>>>
>>>>> This was introduced 12 years after C XenStored, so surely someone think this is useful. We should check why this was introduced in OCaml XenStored (I have CCed the author of the patch).
>>>>>
>>>>> If we still think this is not useful, then you should add an explanation in the commit message why the two implementations diverge and possibly update the spec.
>>>>
>>>> Thanks for CC, Julien.
>>>>
>>>> We indeed already use this functionality in our toolstack for guest kdump
>>>> functions. It's not possible in XAPI to adopt libxl model where almost everything
>>>> is restarted for a domain entering kdump, so we have to use this message to
>>>> rebind xenstore evtchn after soft reset without shutting down backends and
>>>> recreating the whole subtree (frontends reconnect fine after that).
>>>>
>>>> We obviously only require it for now to be present in oxenstored only.
>>>> Please don't remove this functionality if possible.
>>>
>>> If I read handling in libxl correctly, in the soft reset case XS_RELEASE
>>> is issued before doing another XS_INTRODUCE. XS_RELEASE will result in
>>> xenstored throwing away its related struct domain, so XS_INTRODUCE will
>>> be possible again.
>>
>>  From what I remember it was not possible to keep xenstored data for a domain
>> and at the same time perform release-introduce cycle (at least in oxenstored).
>> It also involved firing @releaseDomain which caused havoc in other part of
>> the toolstack.
> 
> Wei, Ian, can you please tell me where I'm wrong?
> 
> A soft reset should restart the domain in a clean state. AFAIK libxl is
> handling that by doing kind of in-place save-restore, including calling
> xs_release_domain() and later xs_introduce_domain(). This should result
> in xenstored throwing away all state it has regarding the domain and
> then restarting with a new (internal) domain instance.
> 
> Is XAPI doing soft reset differently? Why should there be a need for
> keeping xenstored data across a soft reset?

Yes, XAPI is doing soft reset differently from libxl. You need to keep xenstore
data to at least keep backends working without the need to reinitialize them 
before entering kdump kernel. We do the same thing while entering crash kernel
in Windows after BSOD (CC Paul as he recommended this approach).
There are other reasons to not reset xenstore data.

I considered XS_INTRODUCE functionality to be part of xenstored ABI and we built
a lot of infrastructure on top of that. So I don't think it could be easily removed now.

Igor

RE: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Paul Durrant 3 years, 12 months ago
> -----Original Message-----
> From: Igor Druzhinin <igor.druzhinin@citrix.com>
> Sent: 29 April 2020 13:50
> To: Jürgen Groß <jgross@suse.com>; Julien Grall <julien@xen.org>; Julien Grall
> <julien.grall.oss@gmail.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu
> <wl@xen.org>; andrew.cooper3@citrix.com; Paul Durrant <paul@xen.org>
> Subject: Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
> 
> On 29/04/2020 13:29, Jürgen Groß wrote:
> > On 29.04.20 13:04, Igor Druzhinin wrote:
> >> On 29/04/2020 11:49, Jürgen Groß wrote:
> >>> On 29.04.20 12:39, Igor Druzhinin wrote:
> >>>> On 29/04/2020 10:22, Julien Grall wrote:
> >>>>> Hi Juergen,
> >>>>>
> >>>>> On 29/04/2020 06:51, Jürgen Groß wrote:
> >>>>>>
> >>>>>> Recreating the event channel would be fine, but I don't see why it
> >>>>>> would ever be needed. And XS_INTRODUCE is called only at domain creation
> >>>>>> time today, so there is obviously no need for repeating this call.
> >>>>>>
> >>>>>> Maybe the idea was to do this after sending a XS_RESUME command, which
> >>>>>> isn't used anywhere in Xen and is another part of Xenstore which doesn't
> >>>>>> make any sense today.
> >>>>>
> >>>>> Commit f6cc37ea8ac71385b60507c034519f304da75f4c "tools/oxenstored: port XS_INTRODUCE evtchn
> rebind function from cxenstored" added the exact same behavior in the OCaml XenStored last year.
> >>>>>
> >>>>> This was introduced 12 years after C XenStored, so surely someone think this is useful. We
> should check why this was introduced in OCaml XenStored (I have CCed the author of the patch).
> >>>>>
> >>>>> If we still think this is not useful, then you should add an explanation in the commit message
> why the two implementations diverge and possibly update the spec.
> >>>>
> >>>> Thanks for CC, Julien.
> >>>>
> >>>> We indeed already use this functionality in our toolstack for guest kdump
> >>>> functions. It's not possible in XAPI to adopt libxl model where almost everything
> >>>> is restarted for a domain entering kdump, so we have to use this message to
> >>>> rebind xenstore evtchn after soft reset without shutting down backends and
> >>>> recreating the whole subtree (frontends reconnect fine after that).
> >>>>
> >>>> We obviously only require it for now to be present in oxenstored only.
> >>>> Please don't remove this functionality if possible.
> >>>
> >>> If I read handling in libxl correctly, in the soft reset case XS_RELEASE
> >>> is issued before doing another XS_INTRODUCE. XS_RELEASE will result in
> >>> xenstored throwing away its related struct domain, so XS_INTRODUCE will
> >>> be possible again.
> >>
> >>  From what I remember it was not possible to keep xenstored data for a domain
> >> and at the same time perform release-introduce cycle (at least in oxenstored).
> >> It also involved firing @releaseDomain which caused havoc in other part of
> >> the toolstack.
> >
> > Wei, Ian, can you please tell me where I'm wrong?
> >
> > A soft reset should restart the domain in a clean state. AFAIK libxl is
> > handling that by doing kind of in-place save-restore, including calling
> > xs_release_domain() and later xs_introduce_domain(). This should result
> > in xenstored throwing away all state it has regarding the domain and
> > then restarting with a new (internal) domain instance.
> >
> > Is XAPI doing soft reset differently? Why should there be a need for
> > keeping xenstored data across a soft reset?
> 
> Yes, XAPI is doing soft reset differently from libxl. You need to keep xenstore
> data to at least keep backends working without the need to reinitialize them
> before entering kdump kernel. We do the same thing while entering crash kernel
> in Windows after BSOD (CC Paul as he recommended this approach).

IIRC I recommended not involving Xen or the toolstack in entering the crash kernel... they don't need to know. Windows quite happily enters its crash kernel, rebuilds its Xen interfaces and re-attaches to PV backends without any external intervention.

  Paul

> There are other reasons to not reset xenstore data.
> 
> I considered XS_INTRODUCE functionality to be part of xenstored ABI and we built
> a lot of infrastructure on top of that. So I don't think it could be easily removed now.
> 
> Igor


Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Igor Druzhinin 3 years, 12 months ago
On 29/04/2020 13:56, Paul Durrant wrote:
>> -----Original Message-----
>> From: Igor Druzhinin <igor.druzhinin@citrix.com>
>> Sent: 29 April 2020 13:50
>> To: Jürgen Groß <jgross@suse.com>; Julien Grall <julien@xen.org>; Julien Grall
>> <julien.grall.oss@gmail.com>
>> Cc: xen-devel <xen-devel@lists.xenproject.org>; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu
>> <wl@xen.org>; andrew.cooper3@citrix.com; Paul Durrant <paul@xen.org>
>> Subject: Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
>>
>> On 29/04/2020 13:29, Jürgen Groß wrote:
>>>
>>> Wei, Ian, can you please tell me where I'm wrong?
>>>
>>> A soft reset should restart the domain in a clean state. AFAIK libxl is
>>> handling that by doing kind of in-place save-restore, including calling
>>> xs_release_domain() and later xs_introduce_domain(). This should result
>>> in xenstored throwing away all state it has regarding the domain and
>>> then restarting with a new (internal) domain instance.
>>>
>>> Is XAPI doing soft reset differently? Why should there be a need for
>>> keeping xenstored data across a soft reset?
>>
>> Yes, XAPI is doing soft reset differently from libxl. You need to keep xenstore
>> data to at least keep backends working without the need to reinitialize them
>> before entering kdump kernel. We do the same thing while entering crash kernel
>> in Windows after BSOD (CC Paul as he recommended this approach).
> 
> IIRC I recommended not involving Xen or the toolstack in entering the crash kernel... they don't need to know. Windows quite happily enters its crash kernel, rebuilds its Xen interfaces and re-attaches to PV backends without any external intervention.

In case of kdump toolstack certainly needs to know as it gets shutdown code 5 and
needs to restart the domain.

And I'm not completely sure it's possible to enter kdump without calling soft reset
at all. Even if it's possible I'd be cautious to do it in production for the future.

Igor

RE: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Paul Durrant 3 years, 12 months ago
> -----Original Message-----
> From: Igor Druzhinin <igor.druzhinin@citrix.com>
> Sent: 29 April 2020 14:17
> To: paul@xen.org; 'Jürgen Groß' <jgross@suse.com>; 'Julien Grall' <julien@xen.org>; 'Julien Grall'
> <julien.grall.oss@gmail.com>
> Cc: 'xen-devel' <xen-devel@lists.xenproject.org>; 'Ian Jackson' <ian.jackson@eu.citrix.com>; 'Wei Liu'
> <wl@xen.org>; andrew.cooper3@citrix.com
> Subject: Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
> 
> On 29/04/2020 13:56, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Igor Druzhinin <igor.druzhinin@citrix.com>
> >> Sent: 29 April 2020 13:50
> >> To: Jürgen Groß <jgross@suse.com>; Julien Grall <julien@xen.org>; Julien Grall
> >> <julien.grall.oss@gmail.com>
> >> Cc: xen-devel <xen-devel@lists.xenproject.org>; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu
> >> <wl@xen.org>; andrew.cooper3@citrix.com; Paul Durrant <paul@xen.org>
> >> Subject: Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
> >>
> >> On 29/04/2020 13:29, Jürgen Groß wrote:
> >>>
> >>> Wei, Ian, can you please tell me where I'm wrong?
> >>>
> >>> A soft reset should restart the domain in a clean state. AFAIK libxl is
> >>> handling that by doing kind of in-place save-restore, including calling
> >>> xs_release_domain() and later xs_introduce_domain(). This should result
> >>> in xenstored throwing away all state it has regarding the domain and
> >>> then restarting with a new (internal) domain instance.
> >>>
> >>> Is XAPI doing soft reset differently? Why should there be a need for
> >>> keeping xenstored data across a soft reset?
> >>
> >> Yes, XAPI is doing soft reset differently from libxl. You need to keep xenstore
> >> data to at least keep backends working without the need to reinitialize them
> >> before entering kdump kernel. We do the same thing while entering crash kernel
> >> in Windows after BSOD (CC Paul as he recommended this approach).
> >
> > IIRC I recommended not involving Xen or the toolstack in entering the crash kernel... they don't
> need to know. Windows quite happily enters its crash kernel, rebuilds its Xen interfaces and re-
> attaches to PV backends without any external intervention.
> 
> In case of kdump toolstack certainly needs to know as it gets shutdown code 5 and
> needs to restart the domain.
> 

The toolstack needs to restart the domain once the crash has completed, yes.

> And I'm not completely sure it's possible to enter kdump without calling soft reset
> at all. Even if it's possible I'd be cautious to do it in production for the future.
> 

If it is not possible at the moment then IMO it should be made so; having soft reset in the toolstack is a layering violation IMO.

  Paul



Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Igor Druzhinin 3 years, 12 months ago
On 29/04/2020 14:22, Paul Durrant wrote:
>> -----Original Message-----
>> From: Igor Druzhinin <igor.druzhinin@citrix.com>
>> Sent: 29 April 2020 14:17
>> To: paul@xen.org; 'Jürgen Groß' <jgross@suse.com>; 'Julien Grall' <julien@xen.org>; 'Julien Grall'
>> <julien.grall.oss@gmail.com>
>> Cc: 'xen-devel' <xen-devel@lists.xenproject.org>; 'Ian Jackson' <ian.jackson@eu.citrix.com>; 'Wei Liu'
>> <wl@xen.org>; andrew.cooper3@citrix.com
>> Subject: Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
>>
>> On 29/04/2020 13:56, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Igor Druzhinin <igor.druzhinin@citrix.com>
>>>> Sent: 29 April 2020 13:50
>>>> To: Jürgen Groß <jgross@suse.com>; Julien Grall <julien@xen.org>; Julien Grall
>>>> <julien.grall.oss@gmail.com>
>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu
>>>> <wl@xen.org>; andrew.cooper3@citrix.com; Paul Durrant <paul@xen.org>
>>>> Subject: Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
>>>>
>>>> On 29/04/2020 13:29, Jürgen Groß wrote:
>>>>>
>>>>> Wei, Ian, can you please tell me where I'm wrong?
>>>>>
>>>>> A soft reset should restart the domain in a clean state. AFAIK libxl is
>>>>> handling that by doing kind of in-place save-restore, including calling
>>>>> xs_release_domain() and later xs_introduce_domain(). This should result
>>>>> in xenstored throwing away all state it has regarding the domain and
>>>>> then restarting with a new (internal) domain instance.
>>>>>
>>>>> Is XAPI doing soft reset differently? Why should there be a need for
>>>>> keeping xenstored data across a soft reset?
>>>>
>>>> Yes, XAPI is doing soft reset differently from libxl. You need to keep xenstore
>>>> data to at least keep backends working without the need to reinitialize them
>>>> before entering kdump kernel. We do the same thing while entering crash kernel
>>>> in Windows after BSOD (CC Paul as he recommended this approach).
>>>
>>> IIRC I recommended not involving Xen or the toolstack in entering the crash kernel... they don't
>> need to know. Windows quite happily enters its crash kernel, rebuilds its Xen interfaces and re-
>> attaches to PV backends without any external intervention.
>>
>> In case of kdump toolstack certainly needs to know as it gets shutdown code 5 and
>> needs to restart the domain.
>>
> 
> The toolstack needs to restart the domain once the crash has completed, yes.

To clarify, what I meant is that once crash happened (before kdump kernel is loaded)
toolstack gets code 5 and then it's supposed to call soft reset and unpause the domain
to actually load into crash kernel.

I didn't mean that after crash kernel is finished the domain has to be restarted - that's
obvious.

Igor

 

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Jürgen Groß 3 years, 12 months ago
On 29.04.20 14:49, Igor Druzhinin wrote:
> On 29/04/2020 13:29, Jürgen Groß wrote:
>> On 29.04.20 13:04, Igor Druzhinin wrote:
>>> On 29/04/2020 11:49, Jürgen Groß wrote:
>>>> On 29.04.20 12:39, Igor Druzhinin wrote:
>>>>> On 29/04/2020 10:22, Julien Grall wrote:
>>>>>> Hi Juergen,
>>>>>>
>>>>>> On 29/04/2020 06:51, Jürgen Groß wrote:
>>>>>>>
>>>>>>> Recreating the event channel would be fine, but I don't see why it
>>>>>>> would ever be needed. And XS_INTRODUCE is called only at domain creation
>>>>>>> time today, so there is obviously no need for repeating this call.
>>>>>>>
>>>>>>> Maybe the idea was to do this after sending a XS_RESUME command, which
>>>>>>> isn't used anywhere in Xen and is another part of Xenstore which doesn't
>>>>>>> make any sense today.
>>>>>>
>>>>>> Commit f6cc37ea8ac71385b60507c034519f304da75f4c "tools/oxenstored: port XS_INTRODUCE evtchn rebind function from cxenstored" added the exact same behavior in the OCaml XenStored last year.
>>>>>>
>>>>>> This was introduced 12 years after C XenStored, so surely someone think this is useful. We should check why this was introduced in OCaml XenStored (I have CCed the author of the patch).
>>>>>>
>>>>>> If we still think this is not useful, then you should add an explanation in the commit message why the two implementations diverge and possibly update the spec.
>>>>>
>>>>> Thanks for CC, Julien.
>>>>>
>>>>> We indeed already use this functionality in our toolstack for guest kdump
>>>>> functions. It's not possible in XAPI to adopt libxl model where almost everything
>>>>> is restarted for a domain entering kdump, so we have to use this message to
>>>>> rebind xenstore evtchn after soft reset without shutting down backends and
>>>>> recreating the whole subtree (frontends reconnect fine after that).
>>>>>
>>>>> We obviously only require it for now to be present in oxenstored only.
>>>>> Please don't remove this functionality if possible.
>>>>
>>>> If I read handling in libxl correctly, in the soft reset case XS_RELEASE
>>>> is issued before doing another XS_INTRODUCE. XS_RELEASE will result in
>>>> xenstored throwing away its related struct domain, so XS_INTRODUCE will
>>>> be possible again.
>>>
>>>   From what I remember it was not possible to keep xenstored data for a domain
>>> and at the same time perform release-introduce cycle (at least in oxenstored).
>>> It also involved firing @releaseDomain which caused havoc in other part of
>>> the toolstack.
>>
>> Wei, Ian, can you please tell me where I'm wrong?
>>
>> A soft reset should restart the domain in a clean state. AFAIK libxl is
>> handling that by doing kind of in-place save-restore, including calling
>> xs_release_domain() and later xs_introduce_domain(). This should result
>> in xenstored throwing away all state it has regarding the domain and
>> then restarting with a new (internal) domain instance.
>>
>> Is XAPI doing soft reset differently? Why should there be a need for
>> keeping xenstored data across a soft reset?
> 
> Yes, XAPI is doing soft reset differently from libxl. You need to keep xenstore
> data to at least keep backends working without the need to reinitialize them
> before entering kdump kernel. We do the same thing while entering crash kernel
> in Windows after BSOD (CC Paul as he recommended this approach).
> There are other reasons to not reset xenstore data.
> 
> I considered XS_INTRODUCE functionality to be part of xenstored ABI and we built
> a lot of infrastructure on top of that. So I don't think it could be easily removed now.

Nobody wants to remove XS_INTRODUCE. It was just questioned there is a
need to call XS_INTRODUCE multiple times for a domain without a call of
XS_RELEASE in between.

In case there is such a need this means we either should just drop the
test for the mfn to match and recreate the event-channel (which will do
no harm IMO), or we need to keep the mfn in the live-update state record
with the knowledge that it is far from being a good indicator for the
test whether a domain is known already or not (two HVM domains using a
similar configuration with the very same kernel will use the same gfn
probably).

As only dom0 is allowed to use XS_INTRODUCE and I'm not aware of any
problems with neither XS_INTRODUCE failing due to illegal calls (no mfn
match), nor with potentially recreating the event channel for a "wrong"
domain, I suggest to just allow XS_INTRODUCE to be called as often as
the toolstack wants to.

In the end that was the primary reason to write this patch: to remove
the need to have the mfn in the live-update state record.

Thoughts?


Juergen

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Igor Druzhinin 3 years, 12 months ago
On 29/04/2020 14:06, Jürgen Groß wrote:
> On 29.04.20 14:49, Igor Druzhinin wrote:
>> On 29/04/2020 13:29, Jürgen Groß wrote:
>>> On 29.04.20 13:04, Igor Druzhinin wrote:
>>
>> Yes, XAPI is doing soft reset differently from libxl. You need to keep xenstore
>> data to at least keep backends working without the need to reinitialize them
>> before entering kdump kernel. We do the same thing while entering crash kernel
>> in Windows after BSOD (CC Paul as he recommended this approach).
>> There are other reasons to not reset xenstore data.
>>
>> I considered XS_INTRODUCE functionality to be part of xenstored ABI and we built
>> a lot of infrastructure on top of that. So I don't think it could be easily removed now.
> 
> Nobody wants to remove XS_INTRODUCE. It was just questioned there is a
> need to call XS_INTRODUCE multiple times for a domain without a call of
> XS_RELEASE in between.

Sorry, I didn't mean the whole XS_INTRODUCE but that particular bit that
allows it to rebind the channels without calling XS_RELEASE first as the
latter has a lot of implications from toolstack point of view.

> In case there is such a need this means we either should just drop the
> test for the mfn to match and recreate the event-channel (which will do
> no harm IMO), or we need to keep the mfn in the live-update state record
> with the knowledge that it is far from being a good indicator for the
> test whether a domain is known already or not (two HVM domains using a
> similar configuration with the very same kernel will use the same gfn
> probably).
> 
> As only dom0 is allowed to use XS_INTRODUCE and I'm not aware of any
> problems with neither XS_INTRODUCE failing due to illegal calls (no mfn
> match), nor with potentially recreating the event channel for a "wrong"
> domain, I suggest to just allow XS_INTRODUCE to be called as often as
> the toolstack wants to.

If that means keeping the current semantics of XS_INTRODUCE - I'd be quite
happy with that.

Igor

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred
Posted by Andrew Cooper 3 years, 12 months ago
On 28/04/2020 16:51, Juergen Gross wrote:
> The XS_INTRODUCE command has two parameters: the mfn (or better: gfn)
> of the domain's xenstore ring page and the event channel of the
> domain for communicating with Xenstore.
>
> The gfn is not really needed. It is stored in the per-domain struct
> in xenstored and in case of another XS_INTRODUCE for the domain it
> is tested to match the original value. If it doesn't match the
> command is aborted via EINVAL.
>
> Today there shouldn't be multiple XS_INTRODUCE requests for the same
> domain issued, so the mfn/gfn can just be ignored and multiple
> XS_INTRODUCE commands can be rejected without testing the mfn/gfn.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>

In hindsight, this is cleanup following c/s 38eeb3864d "tools/xenstored:
Drop mapping of the ring via foreign map".

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>