[PATCH 2/6] xen: add bitmap to indicate per-domain state changes

Juergen Gross posted 6 patches 1 month, 1 week ago
Posted by Juergen Gross 1 month, 1 week ago
Add a bitmap with one bit per possible domid indicating the respective
domain has changed its state (created, deleted, dying, crashed,
shutdown).

Registering the VIRQ_DOM_EXC event will result in setting the bits for
all existing domains and resetting all other bits.

Resetting a bit will be done in a future patch.

This information is needed for Xenstore to keep track of all domains.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/domain.c        | 21 +++++++++++++++++++++
 xen/common/event_channel.c |  2 ++
 xen/include/xen/sched.h    |  2 ++
 3 files changed, 25 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3948640fb0..61b7899cb8 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -138,6 +138,22 @@ bool __read_mostly vmtrace_available;
 
 bool __read_mostly vpmu_is_available;
 
+static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);
+
+void domain_reset_states(void)
+{
+    struct domain *d;
+
+    bitmap_zero(dom_state_changed, DOMID_MASK + 1);
+
+    rcu_read_lock(&domlist_read_lock);
+
+    for_each_domain ( d )
+        set_bit(d->domain_id, dom_state_changed);
+
+    rcu_read_unlock(&domlist_read_lock);
+}
+
 static void __domain_finalise_shutdown(struct domain *d)
 {
     struct vcpu *v;
@@ -152,6 +168,7 @@ static void __domain_finalise_shutdown(struct domain *d)
             return;
 
     d->is_shut_down = 1;
+    set_bit(d->domain_id, dom_state_changed);
     if ( (d->shutdown_code == SHUTDOWN_suspend) && d->suspend_evtchn )
         evtchn_send(d, d->suspend_evtchn);
     else
@@ -832,6 +849,7 @@ struct domain *domain_create(domid_t domid,
      */
     domlist_insert(d);
 
+    set_bit(d->domain_id, dom_state_changed);
     memcpy(d->handle, config->handle, sizeof(d->handle));
 
     return d;
@@ -1097,6 +1115,7 @@ int domain_kill(struct domain *d)
         /* Mem event cleanup has to go here because the rings 
          * have to be put before we call put_domain. */
         vm_event_cleanup(d);
+        set_bit(d->domain_id, dom_state_changed);
         put_domain(d);
         send_global_virq(VIRQ_DOM_EXC);
         /* fallthrough */
@@ -1286,6 +1305,8 @@ static void cf_check complete_domain_destroy(struct rcu_head *head)
 
     xfree(d->vcpu);
 
+    set_bit(d->domain_id, dom_state_changed);
+
     _domain_destroy(d);
 
     send_global_virq(VIRQ_DOM_EXC);
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 8db2ca4ba2..9b87d29968 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1296,6 +1296,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         rc = evtchn_bind_virq(&bind_virq, 0);
         if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
             rc = -EFAULT; /* Cleaning up here would be a mess! */
+        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
+            domain_reset_states();
         break;
     }
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 1dd8a425f9..667863263d 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -800,6 +800,8 @@ void domain_resume(struct domain *d);
 
 int domain_soft_reset(struct domain *d, bool resuming);
 
+void domain_reset_states(void);
+
 int vcpu_start_shutdown_deferral(struct vcpu *v);
 void vcpu_end_shutdown_deferral(struct vcpu *v);
 
-- 
2.43.0
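
The semantics described in the commit message can be modelled in userspace
as below. This is only a minimal sketch, not the hypervisor code: the plain
array-of-words bitmap, the non-atomic helpers mark_changed() and
has_changed(), and the `existing` list standing in for for_each_domain are
assumptions made for illustration, and DOMID_MASK is assumed to be 0x7FFF as
in Xen's public headers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed value, mirroring DOMID_MASK in Xen's public headers. */
#define DOMID_MASK    0x7FFFu
#define NR_DOMIDS     (DOMID_MASK + 1u)

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS      ((NR_DOMIDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per possible domid. */
static unsigned long dom_state_changed[NR_WORDS];

/* Non-atomic stand-in for the hypervisor's set_bit(). */
static void mark_changed(unsigned int domid)
{
    assert(domid < NR_DOMIDS);
    dom_state_changed[domid / BITS_PER_WORD] |= 1UL << (domid % BITS_PER_WORD);
}

static bool has_changed(unsigned int domid)
{
    assert(domid < NR_DOMIDS);
    return (dom_state_changed[domid / BITS_PER_WORD] >>
            (domid % BITS_PER_WORD)) & 1UL;
}

/*
 * Model of domain_reset_states(): clear every bit, then flag each domain
 * that currently exists so a newly bound consumer learns about all of them.
 */
static void reset_states(const unsigned int *existing, size_t nr)
{
    memset(dom_state_changed, 0, sizeof(dom_state_changed));

    for ( size_t i = 0; i < nr; i++ )
        mark_changed(existing[i]);
}
```

After reset_states() only the listed domids are flagged; any later state
change (creation, shutdown, kill, destruction) sets the bit for its domid
again.
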
Re: [PATCH 2/6] xen: add bitmap to indicate per-domain state changes
Posted by Jan Beulich 1 month ago
On 23.10.2024 15:10, Juergen Gross wrote:
> Add a bitmap with one bit per possible domid indicating the respective
> domain has changed its state (created, deleted, dying, crashed,
> shutdown).
> 
> Registering the VIRQ_DOM_EXC event will result in setting the bits for
> all existing domains and resetting all other bits.

That's furthering the "there can be only one consumer" model that also
is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
(nothing keeps a 2nd party with sufficient privilege from invoking
XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
from whoever had first requested it), and hence I dislike this being
extended. Conceivably multiple parties may indeed be interested in
this kind of information. At which point resetting state when the vIRQ
is bound is questionable (or the data would need to become per-domain
rather than global, or even yet more fine-grained, albeit
->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -138,6 +138,22 @@ bool __read_mostly vmtrace_available;
>  
>  bool __read_mostly vpmu_is_available;
>  
> +static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);

While it won't alter the size of the array, I think DOMID_FIRST_RESERVED
would be more logical to use here and ...

> +void domain_reset_states(void)
> +{
> +    struct domain *d;
> +
> +    bitmap_zero(dom_state_changed, DOMID_MASK + 1);

... here.

> +    rcu_read_lock(&domlist_read_lock);
> +
> +    for_each_domain ( d )
> +        set_bit(d->domain_id, dom_state_changed);

d is used only here, so could be pointer-to-const?

> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -1296,6 +1296,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          rc = evtchn_bind_virq(&bind_virq, 0);
>          if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
>              rc = -EFAULT; /* Cleaning up here would be a mess! */
> +        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
> +            domain_reset_states();

evtchn_bind_virq() isn't static, so callers beyond the present ones could
appear without noticing the need for this special casing. Is there a reason
the check can't move into the function? Doing the check in spite of the
copy-out failing is imo still reasonable behavior.

Jan
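
The remark that DOMID_FIRST_RESERVED "won't alter the size of the array" can
be checked with a little arithmetic. The sketch below assumes the values
0x7FF0 and 0x7FFF for the two constants (as in Xen's public headers) and uses
a hypothetical bits_to_longs() helper that rounds up the same way a bitmap
declaration has to.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, mirroring Xen's public headers. */
#define DOMID_FIRST_RESERVED 0x7FF0u
#define DOMID_MASK           0x7FFFu

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nr_bits bits, rounded up. */
static size_t bits_to_longs(size_t nr_bits)
{
    /* Guard against overflow in the rounding addition. */
    assert(nr_bits <= SIZE_MAX - (BITS_PER_LONG - 1));
    return (nr_bits + BITS_PER_LONG - 1) / BITS_PER_LONG;
}
```

With 64-bit longs, 0x8000 bits need exactly 512 words and 0x7FF0 bits need
511.75, which rounds up to 512 as well. The change therefore only documents
intent: bits for reserved domids are never meaningful.
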
Re: [PATCH 2/6] xen: add bitmap to indicate per-domain state changes
Posted by Jürgen Groß 1 month ago
On 31.10.24 11:59, Jan Beulich wrote:
> On 23.10.2024 15:10, Juergen Gross wrote:
>> Add a bitmap with one bit per possible domid indicating the respective
>> domain has changed its state (created, deleted, dying, crashed,
>> shutdown).
>>
>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>> all existing domains and resetting all other bits.
> 
> That's furthering the "there can be only one consumer" model that also
> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
> (nothing keeps a 2nd party with sufficient privilege from invoking
> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
> from whoever had first requested it), and hence I dislike this being
> extended. Conceivably multiple parties may indeed be interested in
> this kind of information. At which point resetting state when the vIRQ
> is bound is questionable (or the data would need to become per-domain
> rather than global, or even yet more fine-grained, albeit
> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).

The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
event which makes the consumer look into the bitmap via the new hypercall.

If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
have one bitmap per consumer of the event. This is not very hard to
modify.

If you'd like that better, I can dynamically allocate the bitmap on
binding VIRQ_DOM_EXC and free it again when unbinding is done.

> 
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -138,6 +138,22 @@ bool __read_mostly vmtrace_available;
>>   
>>   bool __read_mostly vpmu_is_available;
>>   
>> +static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);
> 
> While it won't alter the size of the array, I think DOMID_FIRST_RESERVED
> would be more logical to use here and ...
> 
>> +void domain_reset_states(void)
>> +{
>> +    struct domain *d;
>> +
>> +    bitmap_zero(dom_state_changed, DOMID_MASK + 1);
> 
> ... here.

Fine with me.

> 
>> +    rcu_read_lock(&domlist_read_lock);
>> +
>> +    for_each_domain ( d )
>> +        set_bit(d->domain_id, dom_state_changed);
> 
> d is used only here, so could be pointer-to-const?

Agreed.

> 
>> --- a/xen/common/event_channel.c
>> +++ b/xen/common/event_channel.c
>> @@ -1296,6 +1296,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>           rc = evtchn_bind_virq(&bind_virq, 0);
>>           if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
>>               rc = -EFAULT; /* Cleaning up here would be a mess! */
>> +        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
>> +            domain_reset_states();
> 
> evtchn_bind_virq() isn't static, so callers beyond the present ones could
> appear without noticing the need for this special casing. Is there a reason
> the check can't move into the function? Doing the check in spite of the
> copy-out failing is imo still reasonable behavior.

Moving the test into evtchn_bind_virq() should work. I'll change that.


Juergen
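
Moving the test into the bind helper, as agreed above, could look roughly
like the following userspace model. Everything here is a stand-in chosen for
illustration: bind_virq_model(), reset_states_stub() and the virq_bound[]
table are hypothetical, and the vIRQ numbers are assumptions. The point is
only that every caller of the helper gets the VIRQ_DOM_EXC special-casing
without having to know about it.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative numbers only; assumed, not taken from the patch. */
enum { VIRQ_TIMER = 0, VIRQ_DOM_EXC = 3, NR_VIRQS = 24 };

static bool virq_bound[NR_VIRQS];
static unsigned int nr_resets;      /* how often the reset hook ran */

/* Stand-in for domain_reset_states(). */
static void reset_states_stub(void)
{
    nr_resets++;
}

/*
 * Model of a bind helper that does the special-casing itself.
 * Returns 0 on success, -1 if the vIRQ is already bound.
 */
static int bind_virq_model(unsigned int virq)
{
    assert(virq < NR_VIRQS);

    if ( virq_bound[virq] )
        return -1;

    virq_bound[virq] = true;

    /* Reset only after a successful bind of VIRQ_DOM_EXC. */
    if ( virq == VIRQ_DOM_EXC )
        reset_states_stub();

    return 0;
}
```

Because the reset happens inside the helper, it also runs when a later
copy-out to the guest fails, which matches the behavior considered
reasonable in the review.
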
Re: [PATCH 2/6] xen: add bitmap to indicate per-domain state changes
Posted by Jan Beulich 4 weeks, 1 day ago
On 01.11.2024 07:48, Jürgen Groß wrote:
> On 31.10.24 11:59, Jan Beulich wrote:
>> On 23.10.2024 15:10, Juergen Gross wrote:
>>> Add a bitmap with one bit per possible domid indicating the respective
>>> domain has changed its state (created, deleted, dying, crashed,
>>> shutdown).
>>>
>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>> all existing domains and resetting all other bits.
>>
>> That's furthering the "there can be only one consumer" model that also
>> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
>> (nothing keeps a 2nd party with sufficient privilege from invoking
>> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
>> from whoever had first requested it), and hence I dislike this being
>> extended. Conceivably multiple parties may indeed be interested in
>> this kind of information. At which point resetting state when the vIRQ
>> is bound is questionable (or the data would need to become per-domain
>> rather than global, or even yet more fine-grained, albeit
>> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).
> 
> The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
> event which makes the consumer look into the bitmap via the new hypercall.
> 
> If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
> have one bitmap per consumer of the event. This is not very hard to
> modify.
> 
> If you'd like that better, I can dynamically allocate the bitmap on
> binding VIRQ_DOM_EXC and free it again when unbinding is done.

I'd prefer that indeed, yet I'm also curious what other maintainers think.

Jan
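
The dynamically allocated variant preferred here might be sketched as below.
It is a userspace approximation under stated assumptions: calloc()/free()
stand in for the hypervisor allocator, the state_bitmap_*() names are
hypothetical, NR_DOMIDS is assumed to be DOMID_FIRST_RESERVED (0x7FF0), and
the bit operations are not atomic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_DOMIDS     0x7FF0u  /* assumed: DOMID_FIRST_RESERVED */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define NR_LONGS      ((NR_DOMIDS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* NULL while no consumer has bound the event. */
static unsigned long *dom_state_changed;

/* Allocate a zeroed bitmap on bind; -1 if already bound or out of memory. */
static int state_bitmap_bind(void)
{
    if ( dom_state_changed )
        return -1;

    dom_state_changed = calloc(NR_LONGS, sizeof(unsigned long));

    return dom_state_changed ? 0 : -1;
}

/* Release the bitmap on unbind. */
static void state_bitmap_unbind(void)
{
    free(dom_state_changed);
    dom_state_changed = NULL;
}

/* Record a state change; a no-op while nobody is bound. */
static void state_bitmap_mark(unsigned int domid)
{
    assert(domid < NR_DOMIDS);

    if ( dom_state_changed )
        dom_state_changed[domid / BITS_PER_LONG] |=
            1UL << (domid % BITS_PER_LONG);
}

static bool state_bitmap_test(unsigned int domid)
{
    assert(domid < NR_DOMIDS);

    return dom_state_changed &&
           ((dom_state_changed[domid / BITS_PER_LONG] >>
             (domid % BITS_PER_LONG)) & 1UL);
}
```

A real implementation would still have to flag all existing domains right
after allocation and would need to serialise bind/unbind against concurrent
markers; both are left out of this sketch.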

Re: [PATCH 2/6] xen: add bitmap to indicate per-domain state changes
Posted by Julien Grall 2 weeks, 3 days ago
Hi Jan & Juergen,

On 04/11/2024 09:35, Jan Beulich wrote:
> On 01.11.2024 07:48, Jürgen Groß wrote:
>> On 31.10.24 11:59, Jan Beulich wrote:
>>> On 23.10.2024 15:10, Juergen Gross wrote:
>>>> Add a bitmap with one bit per possible domid indicating the respective
>>>> domain has changed its state (created, deleted, dying, crashed,
>>>> shutdown).
>>>>
>>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>>> all existing domains and resetting all other bits.
>>>
>>> That's furthering the "there can be only one consumer" model that also
>>> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
>>> (nothing keeps a 2nd party with sufficient privilege from invoking
>>> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
>>> from whoever had first requested it), and hence I dislike this being
>>> extended. Conceivably multiple parties may indeed be interested in
>>> this kind of information. At which point resetting state when the vIRQ
>>> is bound is questionable (or the data would need to become per-domain
>>> rather than global, or even yet more fine-grained, albeit
>>> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).
>>
>> The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
>> event which makes the consumer look into the bitmap via the new hypercall.
>>
>> If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
>> have one bitmap per consumer of the event. This is not very hard to
>> modify.

While in principle I agree that having multiple consumers of
VIRQ_DOM_EXC would be great, I have some scalability concerns, because
we would then have to update N bitmaps on every state change. So we
would need to put a limit on N, and I don't think there is a good limit...

So overall, I am not entirely convinced it is worth the trouble.

Cheers,

-- 
Julien Grall

Re: [PATCH 2/6] xen: add bitmap to indicate per-domain state changes
Posted by Jürgen Groß 2 weeks, 2 days ago
On 16.11.24 12:01, Julien Grall wrote:
> Hi Jan & Juergen,
> 
> On 04/11/2024 09:35, Jan Beulich wrote:
>> On 01.11.2024 07:48, Jürgen Groß wrote:
>>> On 31.10.24 11:59, Jan Beulich wrote:
>>>> On 23.10.2024 15:10, Juergen Gross wrote:
>>>>> Add a bitmap with one bit per possible domid indicating the respective
>>>>> domain has changed its state (created, deleted, dying, crashed,
>>>>> shutdown).
>>>>>
>>>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>>>> all existing domains and resetting all other bits.
>>>>
>>>> That's furthering the "there can be only one consumer" model that also
>>>> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
>>>> (nothing keeps a 2nd party with sufficient privilege from invoking
>>>> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
>>>> from whoever had first requested it), and hence I dislike this being
>>>> extended. Conceivably multiple parties may indeed be interested in
>>>> this kind of information. At which point resetting state when the vIRQ
>>>> is bound is questionable (or the data would need to become per-domain
>>>> rather than global, or even yet more fine-grained, albeit
>>>> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).
>>>
>>> The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
>>> event which makes the consumer look into the bitmap via the new hypercall.
>>>
>>> If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
>>> have one bitmap per consumer of the event. This is not very hard to
>>> modify.
> 
> While in principle I agree that having multiple consumers of VIRQ_DOM_EXC
> would be great, I have some scalability concerns, because we would then have
> to update N bitmaps on every state change. So we would need to put a limit on
> N, and I don't think there is a good limit...

The same applies regarding sending an event. I don't think the additional
setting of a bit is adding a relevant amount of processing time.

I agree that a limit is hard to find, but it could be rather high.

> So overall, I am not entirely convinced it is worth the trouble.

The only real reason I could see would be a setup without Xenstore. The
reason is that with Xenstore all interested parties could register a watch
event instead of directly consuming the VIRQ_DOM_EXC event.

We haven't needed multiple VIRQ_DOM_EXC consumers up to now, so I don't
think we should over-engineer the interface. At least there is a theoretical
solution for multiple consumers, so my patch series wouldn't introduce a
no-go for multiple consumers.


Juergen
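
For reference, the "one bitmap per consumer" idea and its per-change cost
discussed above can be modelled as follows. This is a sketch under
assumptions, not a proposal from the thread: MAX_CONSUMERS, struct consumer
and all function names are hypothetical, NR_DOMIDS is assumed to be 0x7FF0,
and the bit operations are not atomic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define NR_DOMIDS     0x7FF0u  /* assumed: DOMID_FIRST_RESERVED */
#define MAX_CONSUMERS 4u       /* hypothetical limit "N" */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define NR_LONGS      ((NR_DOMIDS + BITS_PER_LONG - 1) / BITS_PER_LONG)

struct consumer {
    bool bound;
    unsigned long changed[NR_LONGS];
};

static struct consumer consumers[MAX_CONSUMERS];

/* Claim a free slot with an empty bitmap; -1 if all slots are taken. */
static int consumer_bind(void)
{
    for ( unsigned int i = 0; i < MAX_CONSUMERS; i++ )
    {
        if ( consumers[i].bound )
            continue;

        memset(consumers[i].changed, 0, sizeof(consumers[i].changed));
        consumers[i].bound = true;

        return (int)i;
    }

    return -1;
}

/* One state change costs one bit operation per bound consumer. */
static void mark_changed(unsigned int domid)
{
    assert(domid < NR_DOMIDS);

    for ( unsigned int i = 0; i < MAX_CONSUMERS; i++ )
        if ( consumers[i].bound )
            consumers[i].changed[domid / BITS_PER_LONG] |=
                1UL << (domid % BITS_PER_LONG);
}

/* Test-and-clear for one consumer; other consumers keep their bit. */
static bool consume_change(int slot, unsigned int domid)
{
    assert(slot >= 0 && (unsigned int)slot < MAX_CONSUMERS);
    assert(domid < NR_DOMIDS);

    unsigned long mask = 1UL << (domid % BITS_PER_LONG);
    unsigned long *word = &consumers[slot].changed[domid / BITS_PER_LONG];
    bool was_set = (*word & mask) != 0;

    *word &= ~mask;

    return was_set;
}
```

The loop in mark_changed() is the cost under discussion: it grows linearly
with the number of bound consumers, in the same way as sending the event to
each of them would.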