Add a bitmap with one bit per possible domid indicating the respective
domain has changed its state (created, deleted, dying, crashed,
shutdown).

Registering the VIRQ_DOM_EXC event will result in setting the bits for
all existing domains and resetting all other bits.

Resetting a bit will be done in a future patch.

This information is needed for Xenstore to keep track of all domains.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
xen/common/domain.c | 21 +++++++++++++++++++++
xen/common/event_channel.c | 2 ++
xen/include/xen/sched.h | 2 ++
3 files changed, 25 insertions(+)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3948640fb0..61b7899cb8 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -138,6 +138,22 @@ bool __read_mostly vmtrace_available;
 
 bool __read_mostly vpmu_is_available;
 
+static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);
+
+void domain_reset_states(void)
+{
+    struct domain *d;
+
+    bitmap_zero(dom_state_changed, DOMID_MASK + 1);
+
+    rcu_read_lock(&domlist_read_lock);
+
+    for_each_domain ( d )
+        set_bit(d->domain_id, dom_state_changed);
+
+    rcu_read_unlock(&domlist_read_lock);
+}
+
 static void __domain_finalise_shutdown(struct domain *d)
 {
     struct vcpu *v;
@@ -152,6 +168,7 @@ static void __domain_finalise_shutdown(struct domain *d)
return;
d->is_shut_down = 1;
+ set_bit(d->domain_id, dom_state_changed);
if ( (d->shutdown_code == SHUTDOWN_suspend) && d->suspend_evtchn )
evtchn_send(d, d->suspend_evtchn);
else
@@ -832,6 +849,7 @@ struct domain *domain_create(domid_t domid,
*/
domlist_insert(d);
+ set_bit(d->domain_id, dom_state_changed);
memcpy(d->handle, config->handle, sizeof(d->handle));
return d;
@@ -1097,6 +1115,7 @@ int domain_kill(struct domain *d)
/* Mem event cleanup has to go here because the rings
* have to be put before we call put_domain. */
vm_event_cleanup(d);
+ set_bit(d->domain_id, dom_state_changed);
put_domain(d);
send_global_virq(VIRQ_DOM_EXC);
/* fallthrough */
@@ -1286,6 +1305,8 @@ static void cf_check complete_domain_destroy(struct rcu_head *head)
xfree(d->vcpu);
+ set_bit(d->domain_id, dom_state_changed);
+
_domain_destroy(d);
send_global_virq(VIRQ_DOM_EXC);
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 8db2ca4ba2..9b87d29968 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1296,6 +1296,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         rc = evtchn_bind_virq(&bind_virq, 0);
         if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
             rc = -EFAULT; /* Cleaning up here would be a mess! */
+        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
+            domain_reset_states();
         break;
     }
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 1dd8a425f9..667863263d 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -800,6 +800,8 @@ void domain_resume(struct domain *d);
int domain_soft_reset(struct domain *d, bool resuming);
+void domain_reset_states(void);
+
int vcpu_start_shutdown_deferral(struct vcpu *v);
void vcpu_end_shutdown_deferral(struct vcpu *v);
--
2.43.0
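
The bookkeeping the patch introduces can be modelled in plain userspace C. The following is only a sketch under stated assumptions: every name in it (`mark_changed`, `model_create`, `model_reset_states`, and so on) is made up for illustration, a boolean array stands in for the hypervisor's domain list, and the plain read-modify-write is not atomic the way Xen's set_bit() is.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative userspace model; none of these names are Xen APIs. */
#define MODEL_DOMID_LIMIT 0x8000u   /* mirrors DOMID_MASK + 1 */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define WORDS ((MODEL_DOMID_LIMIT + BITS_PER_WORD - 1) / BITS_PER_WORD)

static unsigned long state_changed[WORDS];     /* one bit per domid */
static bool domain_exists[MODEL_DOMID_LIMIT];  /* stand-in for the domain list */

/* Flag a domid as having changed state (not atomic, unlike set_bit()). */
static void mark_changed(unsigned int domid)
{
    assert(domid < MODEL_DOMID_LIMIT);
    state_changed[domid / BITS_PER_WORD] |= 1UL << (domid % BITS_PER_WORD);
}

static bool is_changed(unsigned int domid)
{
    assert(domid < MODEL_DOMID_LIMIT);
    return (state_changed[domid / BITS_PER_WORD] >>
            (domid % BITS_PER_WORD)) & 1UL;
}

/* Creation and destruction both flag the domid, as in the patch. */
static void model_create(unsigned int domid)
{
    mark_changed(domid);
    domain_exists[domid] = true;
}

static void model_destroy(unsigned int domid)
{
    mark_changed(domid);
    domain_exists[domid] = false;
}

/*
 * Models what binding VIRQ_DOM_EXC triggers: clear every bit, then
 * flag each domain that currently exists.
 */
static void model_reset_states(void)
{
    memset(state_changed, 0, sizeof(state_changed));

    for ( unsigned int id = 0; id < MODEL_DOMID_LIMIT; id++ )
        if ( domain_exists[id] )
            mark_changed(id);
}
```

In this model, a domain that was created and destroyed before the reset ends up with its bit cleared, while a domain that still exists keeps its bit set, which matches the behaviour described in the commit message.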
On 23.10.2024 15:10, Juergen Gross wrote:
> Add a bitmap with one bit per possible domid indicating the respective
> domain has changed its state (created, deleted, dying, crashed,
> shutdown).
> 
> Registering the VIRQ_DOM_EXC event will result in setting the bits for
> all existing domains and resetting all other bits.

That's furthering the "there can be only one consumer" model that also
is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
(nothing keeps a 2nd party with sufficient privilege from invoking
XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
from whoever had first requested it), and hence I dislike this being
extended. Conceivably multiple parties may indeed be interested in
this kind of information. At which point resetting state when the vIRQ
is bound is questionable (or the data would need to become per-domain
rather than global, or even yet more fine-grained, albeit
->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -138,6 +138,22 @@ bool __read_mostly vmtrace_available;
>  
>  bool __read_mostly vpmu_is_available;
>  
> +static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);

While it won't alter the size of the array, I think DOMID_FIRST_RESERVED
would be more logical to use here and ...

> +void domain_reset_states(void)
> +{
> +    struct domain *d;
> +
> +    bitmap_zero(dom_state_changed, DOMID_MASK + 1);

... here.

> +    rcu_read_lock(&domlist_read_lock);
> +
> +    for_each_domain ( d )
> +        set_bit(d->domain_id, dom_state_changed);

d is used only here, so could be pointer-to-const?

> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -1296,6 +1296,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          rc = evtchn_bind_virq(&bind_virq, 0);
>          if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
>              rc = -EFAULT; /* Cleaning up here would be a mess! */
> +        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
> +            domain_reset_states();

evtchn_bind_virq() isn't static, so callers beyond the present ones could
appear without noticing the need for this special casing. Is there a reason
the check can't move into the function? Doing the check in spite of the
copy-out failing is imo still reasonable behavior.

Jan
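
Two of the review points above lend themselves to a small worked example. The sketch below first checks that sizing the bitmap by DOMID_FIRST_RESERVED instead of DOMID_MASK + 1 yields the same number of array words, then models the special case living inside the bind function rather than at its call site. The two DOMID values and the VIRQ number are taken from Xen's public headers; everything prefixed `model_` or `MODEL_`, including the VIRQ bound used in the sanity check, is a hypothetical stand-in and not the hypervisor's code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Values as defined in Xen's public headers. */
#define DOMID_FIRST_RESERVED 0x7FF0u
#define DOMID_MASK           0x7FFFu

/* Same round-up-to-whole-words computation as a BITS_TO_LONGS() macro. */
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_BITS_TO_LONGS(n) \
    (((n) + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

/* Array length when the bitmap is declared with DOMID_MASK + 1 bits. */
static size_t words_for_mask(void)
{
    return MODEL_BITS_TO_LONGS(DOMID_MASK + 1);
}

/* Array length when it is declared with DOMID_FIRST_RESERVED bits. */
static size_t words_for_first_reserved(void)
{
    return MODEL_BITS_TO_LONGS(DOMID_FIRST_RESERVED);
}

#define MODEL_VIRQ_DOM_EXC 3u   /* value of VIRQ_DOM_EXC */
#define MODEL_NR_VIRQS     24u  /* assumed upper bound for the sanity check */

static unsigned int reset_calls;  /* counts resets, for illustration only */

static void model_reset_states(void)
{
    reset_calls++;
}

/*
 * Hypothetical stand-in for evtchn_bind_virq(): the VIRQ_DOM_EXC special
 * case sits inside the function, so no caller can forget it.
 */
static int model_bind_virq(unsigned int virq)
{
    int rc = 0;  /* pretend the bind itself succeeded */

    assert(virq < MODEL_NR_VIRQS);

    if ( !rc && virq == MODEL_VIRQ_DOM_EXC )
        model_reset_states();

    return rc;
}
```

With the check inside the bind function, the reset also happens when a later copy-out to the guest fails, which is the behaviour the review calls reasonable.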
On 31.10.24 11:59, Jan Beulich wrote:
> On 23.10.2024 15:10, Juergen Gross wrote:
>> Add a bitmap with one bit per possible domid indicating the respective
>> domain has changed its state (created, deleted, dying, crashed,
>> shutdown).
>>
>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>> all existing domains and resetting all other bits.
> 
> That's furthering the "there can be only one consumer" model that also
> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
> (nothing keeps a 2nd party with sufficient privilege from invoking
> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
> from whoever had first requested it), and hence I dislike this being
> extended. Conceivably multiple parties may indeed be interested in
> this kind of information. At which point resetting state when the vIRQ
> is bound is questionable (or the data would need to become per-domain
> rather than global, or even yet more fine-grained, albeit
> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).

The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
event which makes the consumer look into the bitmap via the new hypercall.

If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
have one bitmap per consumer of the event. This is not very hard to
modify.

If you'd like that better, I can dynamically allocate the bitmap on
binding VIRQ_DOM_EXC and freeing it again when unbinding is done.

> 
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -138,6 +138,22 @@ bool __read_mostly vmtrace_available;
>>  
>>  bool __read_mostly vpmu_is_available;
>>  
>> +static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);
> 
> While it won't alter the size of the array, I think DOMID_FIRST_RESERVED
> would be more logical to use here and ...
> 
>> +void domain_reset_states(void)
>> +{
>> +    struct domain *d;
>> +
>> +    bitmap_zero(dom_state_changed, DOMID_MASK + 1);
> 
> ... here.

Fine with me.

> 
>> +    rcu_read_lock(&domlist_read_lock);
>> +
>> +    for_each_domain ( d )
>> +        set_bit(d->domain_id, dom_state_changed);
> 
> d is used only here, so could be pointer-to-const?

Agreed.

> 
>> --- a/xen/common/event_channel.c
>> +++ b/xen/common/event_channel.c
>> @@ -1296,6 +1296,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          rc = evtchn_bind_virq(&bind_virq, 0);
>>          if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
>>              rc = -EFAULT; /* Cleaning up here would be a mess! */
>> +        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
>> +            domain_reset_states();
> 
> evtchn_bind_virq() isn't static, so callers beyond the present ones could
> appear without noticing the need for this special casing. Is there a reason
> the check can't move into the function? Doing the check in spite of the
> copy-out failing is imo still reasonable behavior.

Moving the test into evtchn_bind_virq() should work. I'll change that.

Juergen
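
The offer above, allocating the bitmap when VIRQ_DOM_EXC is bound and freeing it on unbind, might look roughly like the following userspace sketch. It is an assumption-laden model rather than the eventual patch: all names are hypothetical, calloc()/free() stand in for whatever allocator the hypervisor would use, -1 stands in for a proper error code, and the step of flagging all existing domains at bind time is left out for brevity.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Illustrative model only; names are not Xen APIs. */
#define MODEL_NR_DOMIDS 0x7FF0u   /* mirrors DOMID_FIRST_RESERVED */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS ((MODEL_NR_DOMIDS + WORD_BITS - 1) / WORD_BITS)

static unsigned long *changed_map;  /* NULL while no consumer is bound */

/* Bind: allocate a zeroed bitmap; fails if a consumer is already bound. */
static int model_bind(void)
{
    if ( changed_map )
        return -1;

    changed_map = calloc(NR_WORDS, sizeof(*changed_map));

    return changed_map ? 0 : -1;
}

/* Unbind: release the bitmap so state changes are no longer tracked. */
static void model_unbind(void)
{
    free(changed_map);
    changed_map = NULL;
}

/* Record a state change; a no-op while nobody is bound. */
static void model_mark_changed(unsigned int domid)
{
    assert(domid < MODEL_NR_DOMIDS);

    if ( changed_map )
        changed_map[domid / WORD_BITS] |= 1UL << (domid % WORD_BITS);
}

/* Returns 1 if the domid is flagged, 0 otherwise (or if unbound). */
static int model_test_changed(unsigned int domid)
{
    assert(domid < MODEL_NR_DOMIDS);

    return changed_map &&
           ((changed_map[domid / WORD_BITS] >> (domid % WORD_BITS)) & 1UL);
}
```

One consequence visible in the model: while nobody is bound, state changes cost only a pointer check and no memory is held for the bitmap.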
On 01.11.2024 07:48, Jürgen Groß wrote:
> On 31.10.24 11:59, Jan Beulich wrote:
>> On 23.10.2024 15:10, Juergen Gross wrote:
>>> Add a bitmap with one bit per possible domid indicating the respective
>>> domain has changed its state (created, deleted, dying, crashed,
>>> shutdown).
>>>
>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>> all existing domains and resetting all other bits.
>>
>> That's furthering the "there can be only one consumer" model that also
>> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
>> (nothing keeps a 2nd party with sufficient privilege from invoking
>> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
>> from whoever had first requested it), and hence I dislike this being
>> extended. Conceivably multiple parties may indeed be interested in
>> this kind of information. At which point resetting state when the vIRQ
>> is bound is questionable (or the data would need to become per-domain
>> rather than global, or even yet more fine-grained, albeit
>> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).
> 
> The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
> event which makes the consumer look into the bitmap via the new hypercall.
> 
> If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
> have one bitmap per consumer of the event. This is not very hard to
> modify.
> 
> If you'd like that better, I can dynamically allocate the bitmap on
> binding VIRQ_DOM_EXC and freeing it again when unbinding is done.

I'd prefer that indeed, yet I'm also curious what other maintainers think.

Jan
Hi Jan & Juergen,

On 04/11/2024 09:35, Jan Beulich wrote:
> On 01.11.2024 07:48, Jürgen Groß wrote:
>> On 31.10.24 11:59, Jan Beulich wrote:
>>> On 23.10.2024 15:10, Juergen Gross wrote:
>>>> Add a bitmap with one bit per possible domid indicating the respective
>>>> domain has changed its state (created, deleted, dying, crashed,
>>>> shutdown).
>>>>
>>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>>> all existing domains and resetting all other bits.
>>>
>>> That's furthering the "there can be only one consumer" model that also
>>> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
>>> (nothing keeps a 2nd party with sufficient privilege from invoking
>>> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
>>> from whoever had first requested it), and hence I dislike this being
>>> extended. Conceivably multiple parties may indeed be interested in
>>> this kind of information. At which point resetting state when the vIRQ
>>> is bound is questionable (or the data would need to become per-domain
>>> rather than global, or even yet more fine-grained, albeit
>>> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).
>>
>> The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
>> event which makes the consumer look into the bitmap via the new hypercall.
>>
>> If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
>> have one bitmap per consumer of the event. This is not very hard to
>> modify.

While in principle I agree that having multiple consumers of VIRQ_DOM_EXC would
be great. I have some scalability concern because now we would end up to have to
update N bitmap every time. So we would need to put a limit to N. I don't think
there is a good limit...

So overall, I am not entirely convinced it is worth the trouble.

Cheers,

-- 
Julien Grall
On 16.11.24 12:01, Julien Grall wrote:
> Hi Jan & Juergen,
> 
> On 04/11/2024 09:35, Jan Beulich wrote:
>> On 01.11.2024 07:48, Jürgen Groß wrote:
>>> On 31.10.24 11:59, Jan Beulich wrote:
>>>> On 23.10.2024 15:10, Juergen Gross wrote:
>>>>> Add a bitmap with one bit per possible domid indicating the respective
>>>>> domain has changed its state (created, deleted, dying, crashed,
>>>>> shutdown).
>>>>>
>>>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>>>> all existing domains and resetting all other bits.
>>>>
>>>> That's furthering the "there can be only one consumer" model that also
>>>> is used for VIRQ_DOM_EXC itself. I consider the existing model flawed
>>>> (nothing keeps a 2nd party with sufficient privilege from invoking
>>>> XEN_DOMCTL_set_virq_handler a 2nd time, taking away the notification
>>>> from whoever had first requested it), and hence I dislike this being
>>>> extended. Conceivably multiple parties may indeed be interested in
>>>> this kind of information. At which point resetting state when the vIRQ
>>>> is bound is questionable (or the data would need to become per-domain
>>>> rather than global, or even yet more fine-grained, albeit
>>>> ->virq_to_evtchn[] is also per-domain, when considering global vIRQ-s).
>>>
>>> The bitmap is directly tied to the VIRQ_DOM_EXC anyway, as it is that
>>> event which makes the consumer look into the bitmap via the new hypercall.
>>>
>>> If we decide to allow multiple consumers of VIRQ_DOM_EXC, we'll need to
>>> have one bitmap per consumer of the event. This is not very hard to
>>> modify.
> 
> While in principle I agree that having multiple consumers of VIRQ_DOM_EXC would
> be great. I have some scalability concern because now we would end up to have to
> update N bitmap every time. So we would need to put a limit to N. I don't think
> there is a good limit...

The same applies regarding sending an event. I don't think the additional
setting of a bit is adding a relevant amount of processing time. I agree
that a limit is hard to find, but it could be rather high.

> So overall, I am not entirely convinced it is worth the trouble.

The only real reason I could see would be a setup without Xenstore. Reason
is that with Xenstore all interested parties could register a watch event
instead of directly consuming the VIRQ_DOM_EXC event.

We haven't needed multiple VIRQ_DOM_EXC consumers up to now, so I don't
think we should over-engineer the interface. At least there is a theoretical
solution for multiple consumers, so my patch series wouldn't introduce a
no-go for multiple consumers.

Juergen
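
The "theoretical solution for multiple consumers" discussed in this exchange can be sketched as one bitmap per bound consumer, with every state change walking all bound consumers once to set a bit and count a notification. This is purely a hypothetical model: the names, the fixed cap of four consumers, and the test-and-clear helper (which only imitates what the future hypercall might do) are all assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative model only; names and the cap are assumptions. */
#define MODEL_NR_DOMIDS     0x7FF0u
#define MODEL_MAX_CONSUMERS 4u   /* arbitrary; the thread sees no obvious limit */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS ((MODEL_NR_DOMIDS + WORD_BITS - 1) / WORD_BITS)

struct model_consumer {
    bool bound;
    unsigned int events;              /* notifications "sent" so far */
    unsigned long changed[NR_WORDS];  /* this consumer's private bitmap */
};

static struct model_consumer consumers[MODEL_MAX_CONSUMERS];

/* Bind a new consumer; returns its slot index, or -1 if the cap is hit. */
static int model_bind(void)
{
    for ( unsigned int i = 0; i < MODEL_MAX_CONSUMERS; i++ )
    {
        if ( consumers[i].bound )
            continue;

        memset(&consumers[i], 0, sizeof(consumers[i]));
        consumers[i].bound = true;

        return (int)i;
    }

    return -1;
}

/*
 * A state change visits every bound consumer once: the bit set and the
 * notification happen in the same loop iteration.
 */
static void model_state_change(unsigned int domid)
{
    assert(domid < MODEL_NR_DOMIDS);

    for ( unsigned int i = 0; i < MODEL_MAX_CONSUMERS; i++ )
    {
        struct model_consumer *c = &consumers[i];

        if ( !c->bound )
            continue;

        c->changed[domid / WORD_BITS] |= 1UL << (domid % WORD_BITS);
        c->events++;
    }
}

/* Each consumer reads and clears its own bit independently of the others. */
static bool model_test_and_clear(int slot, unsigned int domid)
{
    assert(slot >= 0 && (unsigned int)slot < MODEL_MAX_CONSUMERS);
    assert(domid < MODEL_NR_DOMIDS);

    unsigned long *word = &consumers[slot].changed[domid / WORD_BITS];
    unsigned long bit = 1UL << (domid % WORD_BITS);
    bool was_set = (*word & bit) != 0;

    *word &= ~bit;

    return was_set;
}
```

The loop in `model_state_change` illustrates both sides of the argument: the work grows linearly with the number of consumers, yet the per-consumer bit set rides along with a notification that has to be delivered per consumer anyway.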