When performing multiple migrations in parallel, the domctl lock may
become extremely contended:

* Operations like "xl vcpu-list" were observed to take in excess of 20s
  to execute.
* The "clean" shadow op may pause the domain, restart with a
  continuation and then become blocked on the domctl lock, causing VM
  downtime in excess of 20 seconds.

These issues can be fixed by not holding the domctl for the frequently
called operations during migration.

Thanks

Ross Lagerwall (2):
  domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock
  domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock

 xen/arch/x86/domctl.c    |  4 ++++
 xen/arch/x86/mm/paging.c |  8 ++++++--
 xen/common/domctl.c      | 13 +++++++++++++
 3 files changed, 23 insertions(+), 2 deletions(-)

-- 
2.53.0
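[Editorial illustration, not part of the posted series.] The idea in the
cover letter, handling frequently called operations before the global
lock is taken, can be sketched roughly as below. This is a simplified
single-threaded model and not Xen's code: the op names, the
`sketch_dispatch()` function, the `sketch_stats` structure and the
boolean standing in for the lock are all hypothetical. It also treats a
whole op as lock-free, whereas the series only does this for some of
XEN_DOMCTL_shadow_op.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical op numbers; these are NOT Xen's XEN_DOMCTL_* values. */
enum sketch_op {
    OP_PAGE_INFO,   /* stand-in for a frequently called migration op */
    OP_DIRTY_LOG,   /* stand-in for another frequently called op */
    OP_OTHER        /* everything else still serialises on the lock */
};

/* Counters so a caller can see which path each request took. */
struct sketch_stats {
    unsigned int locked_calls;
    unsigned int lockless_calls;
};

/* A boolean stands in for the global lock in this single-threaded model. */
static bool global_lock_held;

static void global_lock(void)
{
    assert(!global_lock_held);
    global_lock_held = true;
}

static void global_unlock(void)
{
    assert(global_lock_held);
    global_lock_held = false;
}

/* Assumption: these ops touch only state covered by finer-grained locks. */
static bool op_needs_global_lock(enum sketch_op op)
{
    switch (op) {
    case OP_PAGE_INFO:
    case OP_DIRTY_LOG:
        return false;
    default:
        return true;
    }
}

/* Dispatch one request, taking the global lock only when it is needed. */
static int sketch_dispatch(enum sketch_op op, struct sketch_stats *st)
{
    if (!op_needs_global_lock(op)) {
        /* Fast path: handled without ever contending on the global lock. */
        st->lockless_calls++;
        return 0;
    }

    global_lock();
    st->locked_calls++;
    global_unlock();
    return 0;
}
```

With this shape, a burst of OP_PAGE_INFO requests from several migration
processes would no longer delay an unrelated OP_OTHER request. Whether an
operation is actually safe to run without the global lock depends on what
other locking protects the state it touches, which the sketch does not
model.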
On 09.06.2026 17:15, Ross Lagerwall wrote:
> When performing multiple migrations in parallel, the domctl lock may
> become extremely contended:
>
> * Operations like "xl vcpu-list" were observed to take in excess of 20s
>   to execute.

Does "xl vcpu-list" involve ...

> * The "clean" shadow op may pause the domain, restart with a
>   continuation and then become blocked on the domctl lock, causing VM
>   downtime in excess of 20 seconds.
>
> These issues can be fixed by not holding the domctl for the frequently
> called operations during migration.
>
> Thanks
>
> Ross Lagerwall (2):
>   domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock

... XEN_DOMCTL_getpageframeinfo3?

Jan

>   domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock
>
>  xen/arch/x86/domctl.c    |  4 ++++
>  xen/arch/x86/mm/paging.c |  8 ++++++--
>  xen/common/domctl.c      | 13 +++++++++++++
>  3 files changed, 23 insertions(+), 2 deletions(-)
>
On 6/11/26 3:55 PM, Jan Beulich wrote:
> On 09.06.2026 17:15, Ross Lagerwall wrote:
>> When performing multiple migrations in parallel, the domctl lock may
>> become extremely contended:
>>
>> * Operations like "xl vcpu-list" were observed to take in excess of 20s
>>   to execute.
>
> Does "xl vcpu-list" involve ...
>
>> * The "clean" shadow op may pause the domain, restart with a
>>   continuation and then become blocked on the domctl lock, causing VM
>>   downtime in excess of 20 seconds.
>>
>> These issues can be fixed by not holding the domctl for the frequently
>> called operations during migration.
>>
>> Thanks
>>
>> Ross Lagerwall (2):
>>   domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock
>
> ... XEN_DOMCTL_getpageframeinfo3?
>

No, but "xl vcpu-list" takes the domctl lock and this contends with
XEN_DOMCTL_getpageframeinfo3 and XEN_DOMCTL_shadow_op taking the domctl lock
which are called frequently by the migration process(es).

Various other operations were slow due to the domctl lock contention but "xl
vcpu-list" was the most obviously visible example.

Ross
On 11.06.2026 18:02, Ross Lagerwall wrote:
> On 6/11/26 3:55 PM, Jan Beulich wrote:
>> On 09.06.2026 17:15, Ross Lagerwall wrote:
>>> When performing multiple migrations in parallel, the domctl lock may
>>> become extremely contended:
>>>
>>> * Operations like "xl vcpu-list" were observed to take in excess of 20s
>>>   to execute.
>>
>> Does "xl vcpu-list" involve ...
>>
>>> * The "clean" shadow op may pause the domain, restart with a
>>>   continuation and then become blocked on the domctl lock, causing VM
>>>   downtime in excess of 20 seconds.
>>>
>>> These issues can be fixed by not holding the domctl for the frequently
>>> called operations during migration.
>>>
>>> Thanks
>>>
>>> Ross Lagerwall (2):
>>>   domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock
>>
>> ... XEN_DOMCTL_getpageframeinfo3?
>>
>
> No, but "xl vcpu-list" takes the domctl lock

If this is still the case after XSA-492, then maybe the follow-ups I have
pending to post will eliminate (or at least reduce) this. I don't think
that's 4.22 material, though.

> and this contends with
> XEN_DOMCTL_getpageframeinfo3 and XEN_DOMCTL_shadow_op taking the domctl lock
> which are called frequently by the migration process(es).
>
> Various other operations were slow due to the domctl lock contention but "xl
> vcpu-list" was the most obviously visible example.

I see.

Jan
On 6/9/26 4:15 PM, Ross Lagerwall wrote:
> When performing multiple migrations in parallel, the domctl lock may
> become extremely contended:
>
> * Operations like "xl vcpu-list" were observed to take in excess of 20s
>   to execute.
> * The "clean" shadow op may pause the domain, restart with a
>   continuation and then become blocked on the domctl lock, causing VM
>   downtime in excess of 20 seconds.
>
> These issues can be fixed by not holding the domctl for the frequently
> called operations during migration.
>
> Thanks
>
> Ross Lagerwall (2):
>   domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock
>   domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock
>
>  xen/arch/x86/domctl.c    |  4 ++++
>  xen/arch/x86/mm/paging.c |  8 ++++++--
>  xen/common/domctl.c      | 13 +++++++++++++
>  3 files changed, 23 insertions(+), 2 deletions(-)
>

I'd like to request inclusion of this in 4.22 since it fixes a real
customer issue we have observed and would have been posted some time ago
but was delayed to avoid drawing attention to and colliding with
XSA-492.

Thanks,
Ross
On 6/10/26 11:57 AM, Ross Lagerwall wrote:
> On 6/9/26 4:15 PM, Ross Lagerwall wrote:
>> When performing multiple migrations in parallel, the domctl lock may
>> become extremely contended:
>>
>> * Operations like "xl vcpu-list" were observed to take in excess of 20s
>>   to execute.
>> * The "clean" shadow op may pause the domain, restart with a
>>   continuation and then become blocked on the domctl lock, causing VM
>>   downtime in excess of 20 seconds.
>>
>> These issues can be fixed by not holding the domctl for the frequently
>> called operations during migration.
>>
>> Thanks
>>
>> Ross Lagerwall (2):
>>   domctl: Handle XEN_DOMCTL_getpageframeinfo3 without the domctl lock
>>   domctl: Handle some of XEN_DOMCTL_shadow_op without the domctl lock
>>
>>  xen/arch/x86/domctl.c    |  4 ++++
>>  xen/arch/x86/mm/paging.c |  8 ++++++--
>>  xen/common/domctl.c      | 13 +++++++++++++
>>  3 files changed, 23 insertions(+), 2 deletions(-)
>>
>
> I'd like to request inclusion of this in 4.22 since it fixes a real
> customer issue we have observed and would have been posted some time ago
> but was delayed to avoid drawing attention to and colliding with
> XSA-492.

Considering this and the performance improvements, it would be really nice
to have it in 4.22:

Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Thanks.

~ Oleksii