Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.

This fixes using the Xen console which assumes domid 0 to use the
hypercall interface.

Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
---
xen/arch/arm/dom0less-build.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index e539bcc762..5a7871939b 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -969,6 +969,7 @@ void __init create_domUs(void)
     dt_for_each_child_node(chosen, node)
     {
         struct domain *d;
+        domid_t domid;
         struct xen_domctl_createdomain d_cfg = {
             .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
             .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
@@ -1121,7 +1122,12 @@ void __init create_domUs(void)
          * very important to use the pre-increment operator to call
          * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
          */
-        d = domain_create(++max_init_domid, &d_cfg, flags);
+        if ( flags & CDF_hardware )
+            domid = 0;
+        else
+            domid = ++max_init_domid;
+
+        d = domain_create(domid, &d_cfg, flags);
         if ( IS_ERR(d) )
             panic("Error creating domain %s (rc = %ld)\n",
                   dt_node_name(node), PTR_ERR(d));
--
2.48.1

On 06.03.2025 23:03, Jason Andryuk wrote:
> Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.
>
> This fixes using the Xen console which assumes domid 0 to use the
> hypercall interface.

Iirc a patch by Denis Mukhin is taking care of that, if what's meant is
the input focus switching logic.

Jan

Hi,

On 06/03/2025 22:03, Jason Andryuk wrote:
> Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.

A few years ago, we went to great lengths to avoid making the assumption
that the hardware domain is domid 0. See all the calls to
"is_hardware_domain()". So I am reluctant to force the domain ID to 0.

>
> This fixes using the Xen console which assumes domid 0 to use the
> hypercall interface.

I had a brief look at drivers/char/console.c and I can't find any place
assuming "domid 0". Do you have any pointer?

>
> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
> ---
> xen/arch/arm/dom0less-build.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index e539bcc762..5a7871939b 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -969,6 +969,7 @@ void __init create_domUs(void)
>      dt_for_each_child_node(chosen, node)
>      {
>          struct domain *d;
> +        domid_t domid;
>          struct xen_domctl_createdomain d_cfg = {
>              .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
>              .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> @@ -1121,7 +1122,12 @@ void __init create_domUs(void)
>           * very important to use the pre-increment operator to call
>           * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
>           */
> -        d = domain_create(++max_init_domid, &d_cfg, flags);
> +        if ( flags & CDF_hardware )
> +            domid = 0;
> +        else
> +            domid = ++max_init_domid;
> +
> +        d = domain_create(domid, &d_cfg, flags);
>          if ( IS_ERR(d) )
>              panic("Error creating domain %s (rc = %ld)\n",
>                    dt_node_name(node), PTR_ERR(d));

Cheers,

--
Julien Grall
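
For readers following the objection above, the difference between a role check and a domid check can be sketched in a few lines of standalone C. This is a simplified model, not Xen's code: `struct domain` is reduced to one field, `hardware_domain` stands in for Xen's global of the same name, and `assumes_domid_zero()` is a hypothetical helper that exists only to show the pattern being avoided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for Xen's types; not the real definitions. */
typedef uint16_t domid_t;

struct domain {
    domid_t domain_id;
};

/* Stand-in for Xen's global pointer to the hardware domain. */
static struct domain *hardware_domain;

/* Role check: true for whichever domain was designated as hwdom. */
static bool is_hardware_domain(const struct domain *d)
{
    assert(d != NULL);
    return d == hardware_domain;
}

/* Hypothetical helper: the pattern the thread wants to avoid,
 * where the role is inferred from the numeric ID. */
static bool assumes_domid_zero(const struct domain *d)
{
    assert(d != NULL);
    return d->domain_id == 0;
}
```

With a hardware domain that happens to have domid 1, the role check still identifies it while the ID-based check does not, which is the failure mode that code hardcoding domid 0 runs into.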

On 2025-03-07 03:31, Julien Grall wrote:
> Hi,
>
> On 06/03/2025 22:03, Jason Andryuk wrote:
>> Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.
>
> A few years ago, we went to great lengths to avoid making the assumption
> that the hardware domain is domid 0. See all the calls to
> "is_hardware_domain()". So I am reluctant to force the domain ID to 0.

I was disappointed when it didn't "just work".

>>
>> This fixes using the Xen console which assumes domid 0 to use the
>> hypercall interface.
>
> I had a brief look at drivers/char/console.c and I can't find any place
> assuming "domid 0". Do you have any pointer?

As Jan pointed out, Denis Mukhin's patch removed the domid 0 assumption.
This was developed without this patch when it mattered.

I tested before posting without this patch (and with Denis's), and again
now, and I didn't get a hwdom login. Turns out xenstored was assuming
domid 0. Changing that with --master-domid gets to the login prompt.

Still, there are now other userspace errors. xen-init-dom0 hardcodes
domid 0 which doesn't exist. init-dom0less only initializes
non-introduced domains, so hwdom doesn't get its "domid" xenstore node
populated. That leads to other errors.

So I think with Denis's patch, this isn't strictly needed. It does help
existing toolstack code work today.

Regards,
Jason

Hi Jason,

On 07/03/2025 16:03, Jason Andryuk wrote:
> On 2025-03-07 03:31, Julien Grall wrote:
>> Hi,
>>
>> On 06/03/2025 22:03, Jason Andryuk wrote:
>>> Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.
>>
>> A few years ago, we went to great lengths to avoid making the
>> assumption that the hardware domain is domid 0. See all the calls to
>> "is_hardware_domain()". So I am reluctant to force the domain ID to 0.
>
> I was disappointed when it didn't "just work".
>
>>>
>>> This fixes using the Xen console which assumes domid 0 to use the
>>> hypercall interface.
>>
>> I had a brief look at drivers/char/console.c and I can't find any
>> place assuming "domid 0". Do you have any pointer?
>
> As Jan pointed out, Denis Mukhin's patch removed the domid 0 assumption.
> This was developed without this patch when it mattered.
>
> I tested before posting without this patch (and with Denis's), and again
> now, and I didn't get a hwdom login. Turns out xenstored was assuming
> domid 0. Changing that with --master-domid gets to the login prompt.

Hmmm, I am not sure --master-domid should point to the hardware domain.
Instead, it feels like it should be the control domain because it needs
to manage the VMs so needs to create any nodes in Xenstore.

>
> Still, there are now other userspace errors. xen-init-dom0 hardcodes
> domid 0 which doesn't exist.

I am confused. Why would you call xen-init-dom0 in a domain that is
meant to be the hardware domain rather than dom0?

> init-dom0less only initializes non-introduced domains, so hwdom
> doesn't get its "domid" xenstore node populated. That leads to other
> errors.
>
> So I think with Denis's patch, this isn't strictly needed. It does help
> existing toolstack code work today.

I don't think the toolstack is ready for a split between
control/hardware domain. That said, shouldn't the toolstack run in the
control domain? Same for xenstored (unless you have a xenstored domain)?

Cheers,

--
Julien Grall

On Fri, 7 Mar 2025, Julien Grall wrote:
> > init-dom0less only initializes non-introduced domains, so hwdom doesn't get
> > its "domid" xenstore node populated. That leads to other errors.
> >
> > So I think with Denis's patch, this isn't strictly needed. It does help
> > existing toolstack code work today.
>
> I don't think the toolstack is ready for a split between control/hardware
> domain. That said, shouldn't the toolstack run in the control domain? Same for
> xenstored (unless you have a xenstored domain)?

Yes, the toolstack (if present) would be in the control domain.
xenstored doesn't have to be in the control domain and in fact it might
not be advisable to place it there today.

The main difference between the toolstack and xenstored is that the
toolstack only talks to Xen, while xenstored talks to all other VMs,
which is dangerous in many configurations.

Hi,

On 08/03/2025 00:53, Stefano Stabellini wrote:
> On Fri, 7 Mar 2025, Julien Grall wrote:
>>> init-dom0less only initializes non-introduced domains, so hwdom doesn't get
>>> its "domid" xenstore node populated. That leads to other errors.
>>>
>>> So I think with Denis's patch, this isn't strictly needed. It does help
>>> existing toolstack code work today.
>>
>> I don't think the toolstack is ready for a split between control/hardware
>> domain. That said, shouldn't the toolstack run in the control domain? Same for
>> xenstored (unless you have a xenstored domain)?
>
> Yes, the toolstack (if present) would be in the control domain.
> xenstored doesn't have to be in the control domain and in fact it might
> not be advisable to place it there today.
>
> The main difference between the toolstack and xenstored is that the
> toolstack only talks to Xen, while xenstored talks to all other VMs,
> which is dangerous in many configurations.

It is not really clear which toolstack you are referring to. Someone
using vanilla Xen upstream will end up using "xl", which has to talk to
xenstored and also indirectly to each domain (e.g. shutdown/suspend node
in xenstored).

So for this setup, "xenstored" is not optional and I would argue should
be part of the control domain (or in a xenstore stubdomain which IIRC is
not supported on Arm today).

Cheers,

--
Julien Grall

On 2025-03-07 16:01, Julien Grall wrote:
> Hi Jason,
>
> On 07/03/2025 16:03, Jason Andryuk wrote:
>> On 2025-03-07 03:31, Julien Grall wrote:
>>> Hi,
>>>
>>> On 06/03/2025 22:03, Jason Andryuk wrote:
>>>> Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.
>>>
>>> A few years ago, we went to great lengths to avoid making the
>>> assumption that the hardware domain is domid 0. See all the calls to
>>> "is_hardware_domain()". So I am reluctant to force the domain ID to 0.
>>
>> I was disappointed when it didn't "just work".
>>
>>>>
>>>> This fixes using the Xen console which assumes domid 0 to use the
>>>> hypercall interface.
>>>
>>> I had a brief look at drivers/char/console.c and I can't find any
>>> place assuming "domid 0". Do you have any pointer?
>>
>> As Jan pointed out, Denis Mukhin's patch removed the domid 0
>> assumption. This was developed without this patch when it mattered.
>>
>> I tested before posting without this patch (and with Denis's), and
>> again now, and I didn't get a hwdom login. Turns out xenstored was
>> assuming domid 0. Changing that with --master-domid gets to the login
>> prompt.
>
> Hmmm, I am not sure --master-domid should point to the hardware domain.
> Instead, it feels like it should be the control domain because it needs
> to manage the VMs so needs to create any nodes in Xenstore.

--master-domid encompasses "the domid where xenstored is running" (which
really xenstored should figure out itself), and is needed for xenstored
to start.

There is an additional --priv-domid, which can point at the control
domain.

>>
>> Still, there are now other userspace errors. xen-init-dom0 hardcodes
>> domid 0 which doesn't exist.
>
> I am confused. Why would you call xen-init-dom0 in a domain that is
> meant to be the hardware domain rather than dom0?

I was using domid 0 :) Also, it's called by default in xencommons and
sets up the cpupools.

>> init-dom0less only initializes non-introduced domains, so hwdom
>> doesn't get its "domid" xenstore node populated. That leads to other
>> errors.
>>
>> So I think with Denis's patch, this isn't strictly needed. It does
>> help existing toolstack code work today.
>
> I don't think the toolstack is ready for a split between
> control/hardware domain. That said, shouldn't the toolstack run in the
> control domain? Same for xenstored (unless you have a xenstored domain)?

Yes, maybe running control and xenstore together is better. I came from
the perspective of dom0less with a hardware/control split, the toolstack
is less important.

But in general, it's all intertwined. You have to start somewhere
untangling.

Running xenstored in the hardware domain, and leaving hardware domain at
domid 0 seemed like a good way to keep most things working while
splitting out the hardware/control permissions.

There was also this practical consideration:

Xen:
    if ( is_hardware_domain(d) )
        fi.submap |= 1U << XENFEAT_dom0;

arch/arm/xen/enlighten.c
    if (xen_feature(XENFEAT_dom0))
        xen_start_flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;

drivers/xen/xenfs/super.c
    if (xen_initial_domain())
        tmp = "control_d\n";

So control_d is set in /proc/xen/capabilities for the hardware
domain/initial domain. That is checked in xencommons. Sure, that can be
worked around, but I was trying to get something going.

I tried a combined control|xenstore domain, but it doesn't have
/proc/xen/xsd_port, so xenstored fails:

    Failed to initialize dom0 port: No such file or directory

Regards,
Jason
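
The three snippets quoted in the message above form a chain from the hypervisor's role check to what userspace reads in /proc/xen/capabilities. The following standalone model strings them together so the end-to-end effect is visible. It is only a sketch: the `SKETCH_*` bit values are illustrative rather than the real header constants, and every function name here is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative bit values only; the real constants live in the Xen
 * and Linux headers. */
#define SKETCH_XENFEAT_DOM0     (1u << 0)
#define SKETCH_SIF_PRIVILEGED   (1u << 0)
#define SKETCH_SIF_INITDOMAIN   (1u << 1)

/* Xen side: the hardware domain is advertised the dom0 feature. */
static uint32_t xen_feature_submap(bool is_hwdom)
{
    return is_hwdom ? SKETCH_XENFEAT_DOM0 : 0;
}

/* Linux side (enlighten.c): the feature turns into start flags. */
static uint32_t linux_start_flags(uint32_t submap)
{
    if ( submap & SKETCH_XENFEAT_DOM0 )
        return SKETCH_SIF_INITDOMAIN | SKETCH_SIF_PRIVILEGED;
    return 0;
}

/* Linux side (xenfs): the initial domain reports "control_d". */
static const char *xenfs_capabilities(uint32_t start_flags)
{
    return (start_flags & SKETCH_SIF_INITDOMAIN) ? "control_d\n" : "";
}

/* End to end: does a domain with this role see "control_d"? */
static bool sees_control_d(bool is_hwdom)
{
    uint32_t flags = linux_start_flags(xen_feature_submap(is_hwdom));

    /* In this chain both start flags are set or cleared together. */
    assert(!(flags & SKETCH_SIF_INITDOMAIN) ==
           !(flags & SKETCH_SIF_PRIVILEGED));
    return strcmp(xenfs_capabilities(flags), "control_d\n") == 0;
}
```

In this model the hardware domain reports "control_d" even when a separate control domain exists, which matches the observation that xencommons treats the hardware domain as the place to start the toolstack services.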

On Fri, 7 Mar 2025, Jason Andryuk wrote:
> On 2025-03-07 16:01, Julien Grall wrote:
> > Hi Jason,
> >
> > On 07/03/2025 16:03, Jason Andryuk wrote:
> > > On 2025-03-07 03:31, Julien Grall wrote:
> > > > Hi,
> > > >
> > > > On 06/03/2025 22:03, Jason Andryuk wrote:
> > > > > Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.
> > > >
> > > > A few years ago, we went to great lengths to avoid making the assumption
> > > > that the hardware domain is domid 0. See all the calls to
> > > > "is_hardware_domain()". So I am reluctant to force the domain ID to 0.
> > >
> > > I was disappointed when it didn't "just work".
> > >
> > > > >
> > > > > This fixes using the Xen console which assumes domid 0 to use the
> > > > > hypercall interface.
> > > >
> > > > I had a brief look at drivers/char/console.c and I can't find any place
> > > > assuming "domid 0". Do you have any pointer?
> > >
> > > As Jan pointed out, Denis Mukhin's patch removed the domid 0 assumption.
> > > This was developed without this patch when it mattered.
> > >
> > > I tested before posting without this patch (and with Denis's), and again
> > > now, and I didn't get a hwdom login. Turns out xenstored was assuming
> > > domid 0. Changing that with --master-domid gets to the login prompt.
> >
> > Hmmm, I am not sure --master-domid should point to the hardware domain.
> > Instead, it feels like it should be the control domain because it needs to
> > manage the VMs so needs to create any nodes in Xenstore.
>
> --master-domid encompasses "the domid where xenstored is running" (which
> really xenstored should figure out itself), and is needed for xenstored to
> start.
>
> There is an additional --priv-domid, which can point at the control domain.
>
> > >
> > > Still, there are now other userspace errors. xen-init-dom0 hardcodes
> > > domid 0 which doesn't exist.
> >
> > I am confused. Why would you call xen-init-dom0 in a domain that is meant to
> > be the hardware domain rather than dom0?
>
> I was using domid 0 :) Also, it's called by default in xencommons and sets up
> the cpupools.
>
> > > init-dom0less only initializes non-introduced domains, so hwdom doesn't
> > > get its "domid" xenstore node populated. That leads to other errors.
> > >
> > > So I think with Denis's patch, this isn't strictly needed. It does help
> > > existing toolstack code work today.
> >
> > I don't think the toolstack is ready for a split between control/hardware
> > domain. That said, shouldn't the toolstack run in the control domain? Same
> > for xenstored (unless you have a xenstored domain)?
>
> Yes, maybe running control and xenstore together is better. I came from the
> perspective of dom0less with a hardware/control split, the toolstack is less
> important.
>
> But in general, it's all intertwined. You have to start somewhere untangling.
>
> Running xenstored in the hardware domain, and leaving hardware domain at domid
> 0 seemed like a good way to keep most things working while splitting out the
> hardware/control permissions.

In my opinion, there are reasons for placing xenstored in the control
domain and also reasons for placing it in the hardware domain. I think
this is the kind of policy decision I would leave to the user.

In an embedded environment where safety is a concern, it also depends on
whether the user wants to keep xenstore only between non-safe VMs, in
which case I would put it in the hardware domain so that the control
domain is fully isolated and protected. xenstore could be a source of
interference.

Or whether the user wants to use xenstore also for safe VMs, in which
case they are likely to reimplement xenstored in a way that is safety
certified and as part of a safety certified OS such as Zephyr. In this
type of configuration, it would make sense to have xenstored in the
control domain.

I would suggest making the location of xenstored configurable, and only
providing a sensible default when the user doesn't explicitly say. With
the state of the codebase and protocols that we have today, I think we
are not ready for a xenstore free-from-interference implementation, so
it would be safer to leave xenstored in the hardware domain as default,
but we should also make the other configurations possible.

This is the first patch series aimed at untangling the whole thing,
so not everything can be done here. For example, if the location of
Xenstore is not configurable in this series, it may still be OK. We
just need to be careful in the docs and interfaces not to bake it into
an assumption.

I would also say the same for domid == 0: while it would be ideal to
avoid relying on domid == 0 for the hardware domain, we don't have to
resolve everything immediately. This series already achieves a
significant improvement by separating the hardware domain from the
control domain, which is valuable in itself.
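
The "configurable location with a sensible default" idea from the message above could look roughly like the sketch below. Everything in it is hypothetical: no such option exists in the series, and the `xs_location` enum, the option strings "hardware" and "control", and both function names are invented purely to show the shape of a policy knob whose default is the hardware domain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy values; not part of any Xen interface. */
enum xs_location {
    XS_LOC_UNSET,      /* user did not say */
    XS_LOC_HARDWARE,   /* run xenstored in the hardware domain */
    XS_LOC_CONTROL     /* run xenstored in the control domain */
};

/* Parse a hypothetical option string. NULL or an unknown value is
 * treated as "not specified" so the default applies. */
static enum xs_location parse_xs_location(const char *s)
{
    if ( s == NULL )
        return XS_LOC_UNSET;
    if ( strcmp(s, "hardware") == 0 )
        return XS_LOC_HARDWARE;
    if ( strcmp(s, "control") == 0 )
        return XS_LOC_CONTROL;
    return XS_LOC_UNSET;
}

/* Apply the default suggested in the thread: the hardware domain,
 * unless the user chose otherwise. */
static enum xs_location resolve_xs_location(enum xs_location requested)
{
    enum xs_location loc =
        requested == XS_LOC_UNSET ? XS_LOC_HARDWARE : requested;

    /* After resolution the location is always concrete. */
    assert(loc != XS_LOC_UNSET);
    return loc;
}
```

Keeping "unset" as a distinct value, rather than parsing straight to a default, leaves room to change the default later without touching callers that made an explicit choice.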

Hi Stefano,

On 08/03/2025 00:40, Stefano Stabellini wrote:
> On Fri, 7 Mar 2025, Jason Andryuk wrote:
>> On 2025-03-07 16:01, Julien Grall wrote:
>>> Hi Jason,
>>>
>>> On 07/03/2025 16:03, Jason Andryuk wrote:
>>>> On 2025-03-07 03:31, Julien Grall wrote:
>>>>> Hi,
>>>>>
>>>>> On 06/03/2025 22:03, Jason Andryuk wrote:
>>>>>> Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.
>>>>>
>>>>> A few years ago, we went to great lengths to avoid making the assumption
>>>>> that the hardware domain is domid 0. See all the calls to
>>>>> "is_hardware_domain()". So I am reluctant to force the domain ID to 0.
>>>>
>>>> I was disappointed when it didn't "just work".
>>>>
>>>>>>
>>>>>> This fixes using the Xen console which assumes domid 0 to use the
>>>>>> hypercall interface.
>>>>>
>>>>> I had a brief look at drivers/char/console.c and I can't find any place
>>>>> assuming "domid 0". Do you have any pointer?
>>>>
>>>> As Jan pointed out, Denis Mukhin's patch removed the domid 0 assumption.
>>>> This was developed without this patch when it mattered.
>>>>
>>>> I tested before posting without this patch (and with Denis's), and again
>>>> now, and I didn't get a hwdom login. Turns out xenstored was assuming
>>>> domid 0. Changing that with --master-domid gets to the login prompt.
>>>
>>> Hmmm, I am not sure --master-domid should point to the hardware domain.
>>> Instead, it feels like it should be the control domain because it needs to
>>> manage the VMs so needs to create any nodes in Xenstore.
>>
>> --master-domid encompasses "the domid where xenstored is running" (which
>> really xenstored should figure out itself), and is needed for xenstored to
>> start.
>>
>> There is an additional --priv-domid, which can point at the control domain.
>>
>>>>
>>>> Still, there are now other userspace errors. xen-init-dom0 hardcodes
>>>> domid 0 which doesn't exist.
>>>
>>> I am confused. Why would you call xen-init-dom0 in a domain that is meant to
>>> be the hardware domain rather than dom0?
>>
>> I was using domid 0 :) Also, it's called by default in xencommons and sets up
>> the cpupools.
>>
>>>> init-dom0less only initializes non-introduced domains, so hwdom doesn't
>>>> get its "domid" xenstore node populated. That leads to other errors.
>>>>
>>>> So I think with Denis's patch, this isn't strictly needed. It does help
>>>> existing toolstack code work today.
>>>
>>> I don't think the toolstack is ready for a split between control/hardware
>>> domain. That said, shouldn't the toolstack run in the control domain? Same
>>> for xenstored (unless you have a xenstored domain)?
>>
>> Yes, maybe running control and xenstore together is better. I came from the
>> perspective of dom0less with a hardware/control split, the toolstack is less
>> important.
>>
>> But in general, it's all intertwined. You have to start somewhere untangling.
>>
>> Running xenstored in the hardware domain, and leaving hardware domain at domid
>> 0 seemed like a good way to keep most things working while splitting out the
>> hardware/control permissions.
>
> In my opinion, there are reasons for placing xenstored in the control
> domain and also reasons for placing it in the hardware domain. I think
> this is the kind of policy decision I would leave to the user.

I agree it should be a policy decision. But as a default setup, I think
this is muddying the difference between the control domain and hardware
domain. Today's toolstack can't work without xenstored. So intuitively,
xenstored would belong to the control domain in a default setup.

>
> In an embedded environment where safety is a concern, it also depends on
> whether the user wants to keep xenstore only between non-safe VMs, in
> which case I would put it in the hardware domain so that the control
> domain is fully isolated and protected. xenstore could be a source of
> interference.

So your hardware domain is not really a hardware domain, right? This is
more a dom0 minus toolstack? If so, I think it might be helpful if you
add a document explaining what a hardware domain really means with this
series.

>
> Or whether the user wants to use xenstore also for safe VMs, in which
> case they are likely to reimplement xenstored in a way that is safety
> certified and as part of a safety certified OS such as Zephyr. In this
> type of configuration, it would make sense to have xenstored in the
> control domain.
>
> I would suggest making the location of xenstored configurable, and only
> providing a sensible default when the user doesn't explicitly say. With
> the state of the codebase and protocols that we have today, I think we
> are not ready for a xenstore free-from-interference implementation, so
> it would be safer to leave xenstored in the hardware domain as default,
> but we should also make the other configurations possible.
>
> This is the first patch series aimed at untangling the whole thing,
> so not everything can be done here. For example, if the location of
> Xenstore is not configurable in this series, it may still be OK. We
> just need to be careful in the docs and interfaces not to bake it into
> an assumption.
>
> I would also say the same for domid == 0: while it would be ideal to
> avoid relying on domid == 0 for the hardware domain, we don't have to
> resolve everything immediately.

I have to disagree. If we start shipping Xen with "hwdomid == 0", it
will be quite difficult to remove this behavior because new tooling
(which may not be under our control) will likely start/continue to rely
on it. So ...

> This series already achieves a
> significant improvement by separating the hardware domain from the
> control domain, which is valuable in itself.

... while I agree this series is a good step, I don't want to have any
version of Xen where hwdomid == 0 (even temporarily).

Cheers,

--
Julien Grall

On Thu, 6 Mar 2025, Jason Andryuk wrote:
> Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.
>
> This fixes using the Xen console which assumes domid 0 to use the
> hypercall interface.
>
> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>

I hope there is a check already in the code somewhere that triggers an
error if multiple domains are created with domid 0 ?

> ---
> xen/arch/arm/dom0less-build.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index e539bcc762..5a7871939b 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -969,6 +969,7 @@ void __init create_domUs(void)
>      dt_for_each_child_node(chosen, node)
>      {
>          struct domain *d;
> +        domid_t domid;
>          struct xen_domctl_createdomain d_cfg = {
>              .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
>              .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> @@ -1121,7 +1122,12 @@ void __init create_domUs(void)
>           * very important to use the pre-increment operator to call
>           * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
>           */
> -        d = domain_create(++max_init_domid, &d_cfg, flags);
> +        if ( flags & CDF_hardware )
> +            domid = 0;
> +        else
> +            domid = ++max_init_domid;
> +
> +        d = domain_create(domid, &d_cfg, flags);
>          if ( IS_ERR(d) )
>              panic("Error creating domain %s (rc = %ld)\n",
>                    dt_node_name(node), PTR_ERR(d));
> --
> 2.48.1
>

On 2025-03-06 20:32, Stefano Stabellini wrote:
> On Thu, 6 Mar 2025, Jason Andryuk wrote:
>> Assign domid 0 to the hwdom. Normally, dom0less does not use domid 0.
>>
>> This fixes using the Xen console which assumes domid 0 to use the
>> hypercall interface.
>>
>> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
>
> I hope there is a check already in the code somewhere that triggers an
> error if multiple domains are created with domid 0 ?

Hmm, no, the existence check is in the domctl code and not in
domain_create().

The next patch in the series, "xen/arm: Add capabilities to dom0less",
adds a panic when trying to create a second hardware_domain. That check
could be moved here to keep it all together.

Regards,
Jason
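
As an illustration of keeping the duplicate check next to the domid choice, the standalone sketch below combines the patch's selection logic with a guard against a second hardware domain. It is a model under stated assumptions, not the code from either patch: `SKETCH_CDF_HARDWARE`, `struct domid_alloc` and `pick_domid()` are hypothetical names, and a `false` return marks the spot where Xen would panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t domid_t;

/* Hypothetical flag value for this sketch; Xen's CDF_hardware differs. */
#define SKETCH_CDF_HARDWARE (1u << 0)

/* Hypothetical allocator state, mirroring max_init_domid plus a
 * marker for "a hardware domain was already created". */
struct domid_alloc {
    domid_t max_init_domid;
    bool hwdom_created;
};

/*
 * Choose a domid as the patch does: 0 for the hardware domain,
 * ++max_init_domid otherwise. Returns false (where Xen would panic)
 * if a second hardware domain is requested.
 */
static bool pick_domid(struct domid_alloc *a, unsigned int flags,
                       domid_t *out)
{
    assert(a != NULL && out != NULL);

    if ( flags & SKETCH_CDF_HARDWARE )
    {
        if ( a->hwdom_created )
            return false;
        a->hwdom_created = true;
        *out = 0;
        return true;
    }

    /* Pre-increment keeps ordinary domains at domid > 0. */
    *out = ++a->max_init_domid;
    return true;
}
```

Because the hardware domain never touches `max_init_domid`, the other domains are numbered 1, 2, ... regardless of where the hardware domain node appears in the device tree.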