A toolstack is expected to use XEN_DOMCTL_hypercall_init where applicable to
construct a new guest, but is absolutely not expected to use it against
itself. Kernels have a stable ABI for accessing the same functionality, via
MSR 0x40000000.
Found when auditing hypercalls for Host UEFI-SecureBoot safety.
Reported-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Anthony PERARD <anthony.perard@vates.tech>
CC: Michal Orzel <michal.orzel@amd.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Frediano Ziglio <frediano.ziglio@cloud.com>
---
xen/arch/x86/domctl.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 3044f706de1c..bf1ee4ed51a0 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -372,6 +372,14 @@ long arch_do_domctl(
struct page_info *page;
void *hypercall_page;
+ /*
+ * Kernels should use the MSR method to get a hypercall page. The
+ * toolstack should not be using the DOMCTL on itself.
+ */
+ ret = -EINVAL;
+ if ( d == currd )
+ break;
+
page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
if ( !page || !get_page_type(page, PGT_writable_page) )
base-commit: 68797a710f4e91cc09fe5650ee14478316010f88
--
2.39.5
On 06.08.2025 18:55, Andrew Cooper wrote: > --- a/xen/arch/x86/domctl.c > +++ b/xen/arch/x86/domctl.c > @@ -372,6 +372,14 @@ long arch_do_domctl( > struct page_info *page; > void *hypercall_page; > > + /* > + * Kernels should use the MSR method to get a hypercall page. The > + * toolstack should not be using the DOMCTL on itself. > + */ > + ret = -EINVAL; > + if ( d == currd ) > + break; Isn't what the comment says more generally true? To act on themselves, most domctl-s are inappropriate to use, I think. There are a few exceptions, where alternatives simply don't exist (and where, if a kernel wanted to use a domctl [or sysctl], it would need to go through hoops to deal with the interface versioning). Yet there's still the question of whether we shouldn't apply this restriction in a broader fashion. Jan
On 07/08/2025 10:30 am, Jan Beulich wrote: > On 06.08.2025 18:55, Andrew Cooper wrote: >> --- a/xen/arch/x86/domctl.c >> +++ b/xen/arch/x86/domctl.c >> @@ -372,6 +372,14 @@ long arch_do_domctl( >> struct page_info *page; >> void *hypercall_page; >> >> + /* >> + * Kernels should use the MSR method to get a hypercall page. The >> + * toolstack should not be using the DOMCTL on itself. >> + */ >> + ret = -EINVAL; >> + if ( d == currd ) >> + break; > Isn't what the comment says more generally true? To act on themselves, most > domctl-s are inappropriate to use, I think. There are a few exceptions, where > alternatives simply don't exist (and where, if a kernel wanted to use a domctl > [or sysctl], it would need to go through hoops to deal with the interface > versioning). Yet there's still the question of whether we shouldn't apply this > restriction in a broader fashion. I'd go so far as to say that domctls ought to be restricted against oneself, but it's not quite that easy. The majority of them already are restricted because of domain/vcpu_pause(), but this is all ad-hoc. In principle, the control domain ought to be able to issue some of the getter's on itself, but even that's inconsistent. get_dominfo is permitted, but paging ops are not. This is still a TBD for the stable tools interfaces. One option I am considering was to split the opcode space by whether it was logically a getter or setter, but I expect this does not work nicely if we also want to retain backwards compatibility. ~Andrew
On 07.08.2025 12:17, Andrew Cooper wrote: > On 07/08/2025 10:30 am, Jan Beulich wrote: >> On 06.08.2025 18:55, Andrew Cooper wrote: >>> --- a/xen/arch/x86/domctl.c >>> +++ b/xen/arch/x86/domctl.c >>> @@ -372,6 +372,14 @@ long arch_do_domctl( >>> struct page_info *page; >>> void *hypercall_page; >>> >>> + /* >>> + * Kernels should use the MSR method to get a hypercall page. The >>> + * toolstack should not be using the DOMCTL on itself. >>> + */ >>> + ret = -EINVAL; >>> + if ( d == currd ) >>> + break; >> Isn't what the comment says more generally true? To act on themselves, most >> domctl-s are inappropriate to use, I think. There are a few exceptions, where >> alternatives simply don't exist (and where, if a kernel wanted to use a domctl >> [or sysctl], it would need to go through hoops to deal with the interface >> versioning). Yet there's still the question of whether we shouldn't apply this >> restriction in a broader fashion. > > I'd go so far as to say that domctls ought to be restricted against > oneself, but it's not quite that easy. The majority of them already are > restricted because of domain/vcpu_pause(), but this is all ad-hoc. > > In principle, the control domain ought to be able to issue some of the > getter's on itself, but even that's inconsistent. get_dominfo is > permitted, but paging ops are not. Hmm, right. And as said, there are a few which simply can't be done a non- domctl way. For the patch here: Reviewed-by: Jan Beulich <jbeulich@suse.com> > This is still a TBD for the stable tools interfaces. One option I am > considering was to split the opcode space by whether it was logically a > getter or setter, but I expect this does not work nicely if we also want > to retain backwards compatibility. I didn't expect we'd strive for (binary) backwards compatibility there. Source compatibility (i.e. merely requiring a re-compile) may be a goal, yet even there I wouldn't be certain. Jan
On 06/08/2025 5:55 pm, Andrew Cooper wrote: > A toolstack is expected to use XEN_DOMCTL_hypercall_init where applicable to > construct a new guest, but is absolutely not expected to use it against > itself. Kernels have a stable ABI for accessing the same functionality, via > MSR 0x40000000. > > Found when auditing hypercalls for Host UEFI-SecureBoot safety. > > Reported-by: Frediano Ziglio <frediano.ziglio@cloud.com> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: Anthony PERARD <anthony.perard@vates.tech> > CC: Michal Orzel <michal.orzel@amd.com> > CC: Jan Beulich <jbeulich@suse.com> > CC: Julien Grall <julien@xen.org> > CC: Roger Pau Monné <roger.pau@citrix.com> > CC: Stefano Stabellini <sstabellini@kernel.org> > CC: Ross Lagerwall <ross.lagerwall@citrix.com> > CC: Frediano Ziglio <frediano.ziglio@cloud.com> It's worth nothing that the observations which lead to this mean it's impossible to support multiple TCB domains without Flask. Imagine that we did have two control domains and they could operate on each other. It's inappropriate for the domain's kernel to be auditing which domid's are which privilege. Flask can let Xen express that level of control, and in practice the system would need configuring with a privilege hierarchy. Equally-powerful domains which can operate on each other simply doesn't make sense from a system point of view. But, as Flask is not supported yet in Xen, the Host UEFI-SB security policy is going to have to be limited to a single all-power dom0 in the short term. ~Andrew
© 2016 - 2025 Red Hat, Inc.