First of all, rename the evtchn APIs:
* evtchn_destroy => evtchn_teardown
* evtchn_destroy_final => evtchn_destroy
Move both calls into appropriate positions in domain_teardown() and
_domain_destroy(), which avoids having different cleanup logic depending on
the cause of the cleanup.
In particular, this avoids evtchn_teardown() (previously named
evtchn_destroy()) being called redundantly thousands of times on a typical
XEN_DOMCTL_destroydomain hypercall.
No net change in behaviour.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
RFC. While testing this, I observed the following after faking up an -ENOMEM
in dom0's construction:
(XEN) [2020-12-21 16:31:20] NX (Execute Disable) protection active
(XEN) [2020-12-21 16:33:04]
(XEN) [2020-12-21 16:33:04] ****************************************
(XEN) [2020-12-21 16:33:04] Panic on CPU 0:
(XEN) [2020-12-21 16:33:04] Error creating domain 0
(XEN) [2020-12-21 16:33:04] ****************************************
XSA-344 appears to have added nearly 2 minutes of wallclock time into the
domain_create() error path, which isn't ok.
Considering that event channels haven't even been initialised in this
particular scenario, it ought to take ~0 time. Even if event channels have
been initialised, none can be active as the domain isn't visible to the system.
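For reference while reading the diff below: the new evtchn_teardown() call
slots into domain_teardown()'s switch/PROGRESS continuation scheme.  Here is
a minimal, compilable sketch of how that scheme resumes across hypercall
continuations - the struct layout, PROG_done value, and the stub standing in
for evtchn_teardown() are illustrative only, not Xen's actual definitions:

#include <assert.h>
#include <stdio.h>

/* Illustrative stand-ins; not the real Xen types or helpers. */
struct domain {
    struct { unsigned int val; } teardown;
};

#define ERESTART 85

static int evtchn_teardown_stub(struct domain *d)
{
    /* Pretend the first pass is preempted and the second completes. */
    static int calls;

    (void)d;  /* unused in this sketch */

    return ++calls == 1 ? -ERESTART : 0;
}

static int domain_teardown_sketch(struct domain *d)
{
    int rc;

    switch ( d->teardown.val )      /* resume from recorded progress */
    {
#define PROGRESS(x)                 \
        d->teardown.val = PROG_##x; \
        /* Fallthrough */           \
    case PROG_##x

        enum {
            PROG_done = 1,
        };

    case 0:
        rc = evtchn_teardown_stub(d);  /* must be idempotent */
        if ( rc )
            return rc;                 /* continuation re-enters at case 0 */

    PROGRESS(done):
        break;

#undef PROGRESS
    }

    return 0;
}

int main(void)
{
    struct domain d = { { 0 } };
    int rc;

    while ( (rc = domain_teardown_sketch(&d)) == -ERESTART )
        printf("preempted with progress %u; continuing\n", d.teardown.val);

    assert(rc == 0);
    return 0;
}

On -ERESTART the recorded progress is unchanged, so the continuation
re-enters at the same case label; this is why every step has to tolerate
being run more than once.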
---
 xen/common/domain.c        | 17 ++++++++---------
 xen/common/event_channel.c |  8 ++++----
 xen/include/xen/sched.h    |  4 ++--
 3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index ef1987335b..701747b9d9 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -284,6 +284,8 @@ custom_param("extra_guest_irqs", parse_extra_guest_irqs);
  */
 static int domain_teardown(struct domain *d)
 {
+    int rc;
+
     BUG_ON(!d->is_dying);
 
     /*
@@ -313,6 +315,10 @@ static int domain_teardown(struct domain *d)
         };
 
     case 0:
+        rc = evtchn_teardown(d);
+        if ( rc )
+            return rc;
+
     PROGRESS(done):
         break;
 
@@ -335,6 +341,8 @@ static void _domain_destroy(struct domain *d)
     BUG_ON(!d->is_dying);
     BUG_ON(atomic_read(&d->refcnt) != DOMAIN_DESTROYED);
 
+    evtchn_destroy(d);
+
     xfree(d->pbuf);
 
     argo_destroy(d);
@@ -598,11 +606,7 @@ struct domain *domain_create(domid_t domid,
     if ( init_status & INIT_gnttab )
         grant_table_destroy(d);
     if ( init_status & INIT_evtchn )
-    {
-        evtchn_destroy(d);
-        evtchn_destroy_final(d);
         radix_tree_destroy(&d->pirq_tree, free_pirq_struct);
-    }
     if ( init_status & INIT_watchdog )
         watchdog_domain_destroy(d);
 
@@ -792,9 +796,6 @@ int domain_kill(struct domain *d)
         rc = domain_teardown(d);
         if ( rc )
             break;
-        rc = evtchn_destroy(d);
-        if ( rc )
-            break;
         rc = domain_relinquish_resources(d);
         if ( rc != 0 )
             break;
@@ -987,8 +988,6 @@ static void complete_domain_destroy(struct rcu_head *head)
     if ( d->target != NULL )
         put_domain(d->target);
 
-    evtchn_destroy_final(d);
-
     radix_tree_destroy(&d->pirq_tree, free_pirq_struct);
 
     xfree(d->vcpu);
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 4a48094356..c1af54eed5 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1401,7 +1401,7 @@ void free_xen_event_channel(struct domain *d, int port)
     {
         /*
          * Make sure ->is_dying is read /after/ ->valid_evtchns, pairing
-         * with the spin_barrier() and BUG_ON() in evtchn_destroy().
+         * with the spin_barrier() and BUG_ON() in evtchn_teardown().
          */
         smp_rmb();
         BUG_ON(!d->is_dying);
@@ -1421,7 +1421,7 @@ void notify_via_xen_event_channel(struct domain *ld, int lport)
     {
         /*
          * Make sure ->is_dying is read /after/ ->valid_evtchns, pairing
-         * with the spin_barrier() and BUG_ON() in evtchn_destroy().
+         * with the spin_barrier() and BUG_ON() in evtchn_teardown().
         */
         smp_rmb();
         ASSERT(ld->is_dying);
@@ -1499,7 +1499,7 @@ int evtchn_init(struct domain *d, unsigned int max_port)
     return 0;
 }
 
-int evtchn_destroy(struct domain *d)
+int evtchn_teardown(struct domain *d)
 {
     unsigned int i;
 
@@ -1534,7 +1534,7 @@ int evtchn_destroy(struct domain *d)
     return 0;
 }
 
-void evtchn_destroy_final(struct domain *d)
+void evtchn_destroy(struct domain *d)
 {
     unsigned int i, j;
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 3f35c537b8..bb22eeca38 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -142,8 +142,8 @@ struct evtchn
 } __attribute__((aligned(64)));
 
 int evtchn_init(struct domain *d, unsigned int max_port);
-int evtchn_destroy(struct domain *d); /* from domain_kill */
-void evtchn_destroy_final(struct domain *d); /* from complete_domain_destroy */
+int evtchn_teardown(struct domain *d);
+void evtchn_destroy(struct domain *d);
 
 struct waitqueue_vcpu;
-- 
2.11.0
On 21.12.2020 19:14, Andrew Cooper wrote:
> First of all, rename the evtchn APIs:
>  * evtchn_destroy       => evtchn_teardown
>  * evtchn_destroy_final => evtchn_destroy

I wonder in how far this is going to cause confusion with backports
down the road. May I suggest to do only the first of the two renames,
at least until a couple of years from now? Or make the second rename
to e.g. evtchn_cleanup() or evtchn_deinit()?

> RFC. While testing this, I observed the following after faking up an
> -ENOMEM in dom0's construction:
>
> (XEN) [2020-12-21 16:31:20] NX (Execute Disable) protection active
> (XEN) [2020-12-21 16:33:04]
> (XEN) [2020-12-21 16:33:04] ****************************************
> (XEN) [2020-12-21 16:33:04] Panic on CPU 0:
> (XEN) [2020-12-21 16:33:04] Error creating domain 0
> (XEN) [2020-12-21 16:33:04] ****************************************
>
> XSA-344 appears to have added nearly 2 minutes of wallclock time into
> the domain_create() error path, which isn't ok.
>
> Considering that event channels haven't even been initialised in this
> particular scenario, it ought to take ~0 time. Even if event channels
> have been initialised, none can be active as the domain isn't visible
> to the system.

evtchn_init() sets d->valid_evtchns to EVTCHNS_PER_BUCKET. Are you
suggesting cleaning up one bucket's worth of unused event channels
takes two minutes? If this is really the case, and considering there
could at most be unbound Xen channels, perhaps we could avoid calling
evtchn_teardown() from domain_create()'s error path? We'd need to take
care of the then missing accounting (->active_evtchns and
->xen_evtchns), but this should be doable.

Jan
On 22/12/2020 10:48, Jan Beulich wrote:
> On 21.12.2020 19:14, Andrew Cooper wrote:
>> First of all, rename the evtchn APIs:
>>  * evtchn_destroy       => evtchn_teardown
>>  * evtchn_destroy_final => evtchn_destroy
> I wonder in how far this is going to cause confusion with backports
> down the road. May I suggest to do only the first of the two renames,
> at least until a couple of years from now? Or make the second rename
> to e.g. evtchn_cleanup() or evtchn_deinit()?

I considered backports, but I don't think it will be an issue.  The
contents of the two functions are very different, and we're not likely
to be moving the callers in backports.

I'm not fussed about the exact naming, so long as we can reach an
agreement and adhere to it strictly.  The current APIs are a total mess.

I used teardown/destroy because that seems to be one common theme in the
APIs, but it will require some to change their name.

>> RFC. While testing this, I observed the following after faking up an
>> -ENOMEM in dom0's construction:
>>
>> (XEN) [2020-12-21 16:31:20] NX (Execute Disable) protection active
>> (XEN) [2020-12-21 16:33:04]
>> (XEN) [2020-12-21 16:33:04] ****************************************
>> (XEN) [2020-12-21 16:33:04] Panic on CPU 0:
>> (XEN) [2020-12-21 16:33:04] Error creating domain 0
>> (XEN) [2020-12-21 16:33:04] ****************************************
>>
>> XSA-344 appears to have added nearly 2 minutes of wallclock time into
>> the domain_create() error path, which isn't ok.
>>
>> Considering that event channels haven't even been initialised in this
>> particular scenario, it ought to take ~0 time. Even if event channels
>> have been initialised, none can be active as the domain isn't visible
>> to the system.
> evtchn_init() sets d->valid_evtchns to EVTCHNS_PER_BUCKET. Are you
> suggesting cleaning up one bucket's worth of unused event channels
> takes two minutes? If this is really the case, and considering there
> could at most be unbound Xen channels, perhaps we could avoid calling
> evtchn_teardown() from domain_create()'s error path? We'd need to take
> care of the then missing accounting (->active_evtchns and
> ->xen_evtchns), but this should be doable.

Actually, it's a bug in this patch.  evtchn_init() hasn't been called,
so d->valid_evtchns is 0, so the loop is 2^32 iterations long.
Luckily, this is easy to fix.

As for avoiding the call, specifically not.  Part of the problem we're
trying to fix is that we've got two different destruction paths, and
making domain_teardown() fully idempotent is key to fixing that.

~Andrew
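To make the failure mode concrete: a countdown loop over
d->valid_evtchns of the post-XSA-344 shape underflows when the field is
still 0.  A minimal standalone illustration - the field name and loop
shape mirror the discussion above, but this is not the actual Xen code:

#include <limits.h>
#include <stdio.h>

/* Illustrative stand-in for the relevant bookkeeping only. */
struct domain {
    unsigned int valid_evtchns;   /* still 0 if evtchn_init() never ran */
};

int main(void)
{
    struct domain d = { .valid_evtchns = 0 };
    unsigned int i = d.valid_evtchns;

    /*
     * A loop of the shape "for ( i = d->valid_evtchns; --i; )" tests the
     * pre-decremented value.  Starting from 0, --i wraps to UINT_MAX, so
     * the body runs roughly 2^32 times instead of not at all.
     */
    --i;
    printf("first loop test sees i = %u (UINT_MAX = %u)\n", i, UINT_MAX);

    return 0;
}

Once evtchn_init() has run, valid_evtchns starts at one bucket's worth
of ports (EVTCHNS_PER_BUCKET, per Jan's reply above), and the loop
terminates almost immediately.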
On 22.12.2020 12:28, Andrew Cooper wrote:
> On 22/12/2020 10:48, Jan Beulich wrote:
>> On 21.12.2020 19:14, Andrew Cooper wrote:
>>> First of all, rename the evtchn APIs:
>>>  * evtchn_destroy       => evtchn_teardown
>>>  * evtchn_destroy_final => evtchn_destroy
>> I wonder in how far this is going to cause confusion with backports
>> down the road. May I suggest to do only the first of the two renames,
>> at least until a couple of years from now? Or make the second rename
>> to e.g. evtchn_cleanup() or evtchn_deinit()?
> I considered backports, but I don't think it will be an issue.  The
> contents of the two functions are very different, and we're not likely
> to be moving the callers in backports.

Does the same also apply to the old and new call sites of the
functions?

> I'm not fussed about the exact naming, so long as we can reach an
> agreement and adhere to it strictly.  The current APIs are a total mess.
>
> I used teardown/destroy because that seems to be one common theme in the
> APIs, but it will require some to change their name.

So for domains, "teardown" and "destroy" pair up with "create". I don't
think evtchn_create() is a sensible name (the function doesn't really
"create" anything); evtchn_init() seems quite a bit better to me, and
hence evtchn_deinit() could be its counterpart. In particular I don't
think all smaller entity functions involved in doing "xyz" for a
larger entity need to have "xyz" in their names.

Jan
On 22/12/2020 11:52, Jan Beulich wrote:
> On 22.12.2020 12:28, Andrew Cooper wrote:
>> On 22/12/2020 10:48, Jan Beulich wrote:
>>> On 21.12.2020 19:14, Andrew Cooper wrote:
>>>> First of all, rename the evtchn APIs:
>>>>  * evtchn_destroy       => evtchn_teardown
>>>>  * evtchn_destroy_final => evtchn_destroy
>>> I wonder in how far this is going to cause confusion with backports
>>> down the road. May I suggest to do only the first of the two renames,
>>> at least until a couple of years from now? Or make the second rename
>>> to e.g. evtchn_cleanup() or evtchn_deinit()?
>> I considered backports, but I don't think it will be an issue.  The
>> contents of the two functions are very different, and we're not likely
>> to be moving the callers in backports.
> Does the same also apply to the old and new call sites of the
> functions?

I don't understand your question.  I don't intend the new call sites to
ever move again, now that they're part of the properly idempotent path,
and any movement in the older trees would be wrong for anything other
than backporting this fix, which clearly isn't a backport candidate.

(That said - there's a memory leak I need to create a backport for...)

>> I'm not fussed about the exact naming, so long as we can reach an
>> agreement and adhere to it strictly.  The current APIs are a total mess.
>>
>> I used teardown/destroy because that seems to be one common theme in the
>> APIs, but it will require some to change their name.
> So for domains, "teardown" and "destroy" pair up with "create". I don't
> think evtchn_create() is a sensible name (the function doesn't really
> "create" anything); evtchn_init() seems quite a bit better to me, and
> hence evtchn_deinit() could be its counterpart.

You're never going to find a perfect name for all cases, and in this
proposal you've still got evtchn_init/deinit/destroy() as a triple.

What we do need is some clear rules, which will live in the forthcoming
"lifecycle of a domain" document.

> In particular I don't
> think all smaller entity functions involved in doing "xyz" for a
> larger entity need to have "xyz" in their names.

While in principle I agree, I'm not sure the value of having perfect
names outweighs the value of having a simple set of guidelines.

~Andrew
On 22.12.2020 14:33, Andrew Cooper wrote:
> On 22/12/2020 11:52, Jan Beulich wrote:
>> On 22.12.2020 12:28, Andrew Cooper wrote:
>>> On 22/12/2020 10:48, Jan Beulich wrote:
>>>> On 21.12.2020 19:14, Andrew Cooper wrote:
>>>>> First of all, rename the evtchn APIs:
>>>>>  * evtchn_destroy       => evtchn_teardown
>>>>>  * evtchn_destroy_final => evtchn_destroy
>>>> I wonder in how far this is going to cause confusion with backports
>>>> down the road. May I suggest to do only the first of the two renames,
>>>> at least until a couple of years from now? Or make the second rename
>>>> to e.g. evtchn_cleanup() or evtchn_deinit()?
>>> I considered backports, but I don't think it will be an issue.  The
>>> contents of the two functions are very different, and we're not likely
>>> to be moving the callers in backports.
>> Does the same also apply to the old and new call sites of the
>> functions?
>
> I don't understand your question.  I don't intend the new call sites to
> ever move again, now that they're part of the properly idempotent path,
> and any movement in the older trees would be wrong for anything other
> than backporting this fix, which clearly isn't a backport candidate.
>
> (That said - there's a memory leak I need to create a backport for...)

My thinking was that call sites of functions also serve as references
or anchors when you do backports. Having identically named functions
with different purposes may mislead people - both those doing backports
on a very occasional basis, and those of us who do this regularly, but
only on halfway recent trees. I, for one, keep forgetting to check for
bool/true/false when moving to 4.7, or the -ERESTART <=> -EAGAIN change
after 4.4(?). For the former I'll be saved by the compiler yelling at
me, but for the latter one needs to recognize the need for an
adjustment. I'm afraid of the same thing (granted, at a lower
probability) potentially happening here, down the road.

Jan