From: Luca Miccio <lucmiccio@gmail.com>
The xenstore event channel will be allocated for dom0less domains. It is
necessary to have access to the evtchn_alloc_unbound function to do
that, so make evtchn_alloc_unbound public.
Add a skip_xsm parameter to allow disabling the XSM check in
evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
originated from Xen before running any domains.)
Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Wei Liu <wl@xen.org>
---
Changes v3:
- expose evtchn_alloc_unbound, assing a skip_xsm parameter
---
xen/common/event_channel.c | 13 ++++++++-----
xen/include/xen/event.h | 3 +++
2 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index da88ad141a..be57d00a15 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
xsm_evtchn_close_post(chn);
}
-static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
+int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
{
struct evtchn *chn;
struct domain *d;
@@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
ERROR_EXIT_DOM(port, d);
chn = evtchn_from_port(d, port);
- rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
- if ( rc )
- goto out;
+ if ( !skip_xsm )
+ {
+ rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
+ if ( rc )
+ goto out;
+ }
evtchn_write_lock(chn);
@@ -1195,7 +1198,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
struct evtchn_alloc_unbound alloc_unbound;
if ( copy_from_guest(&alloc_unbound, arg, 1) != 0 )
return -EFAULT;
- rc = evtchn_alloc_unbound(&alloc_unbound);
+ rc = evtchn_alloc_unbound(&alloc_unbound, false);
if ( !rc && __copy_to_guest(arg, &alloc_unbound, 1) )
rc = -EFAULT; /* Cleaning up here would be a mess! */
break;
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 21c95e14fd..0a2cdedf7d 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -68,6 +68,9 @@ int evtchn_close(struct domain *d1, int port1, bool guest);
/* Free an event channel. */
void evtchn_free(struct domain *d, struct evtchn *chn);
+/* Create a new event channel port */
+int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm);
+
/* Allocate a specific event channel port. */
int evtchn_allocate_port(struct domain *d, unsigned int port);
--
2.25.1
On 28.01.2022 22:33, Stefano Stabellini wrote: > From: Luca Miccio <lucmiccio@gmail.com> > > The xenstore event channel will be allocated for dom0less domains. It is > necessary to have access to the evtchn_alloc_unbound function to do > that, so make evtchn_alloc_unbound public. > > Add a skip_xsm parameter to allow disabling the XSM check in > evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call > originated from Xen before running any domains.) What I continue to be missing here are some words on why it is okay to skip XSM checking in this case. After all an alternative would be to put in place a suitable domain as "current" one for the check to actually work. Jan
On 1/28/22 16:33, Stefano Stabellini wrote: > From: Luca Miccio <lucmiccio@gmail.com> > > The xenstore event channel will be allocated for dom0less domains. It is > necessary to have access to the evtchn_alloc_unbound function to do > that, so make evtchn_alloc_unbound public. > > Add a skip_xsm parameter to allow disabling the XSM check in > evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call > originated from Xen before running any domains.) > > Signed-off-by: Luca Miccio <lucmiccio@gmail.com> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> > CC: Julien Grall <julien@xen.org> > CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> > CC: Bertrand Marquis <bertrand.marquis@arm.com> > CC: Andrew Cooper <andrew.cooper3@citrix.com> > CC: George Dunlap <george.dunlap@citrix.com> > CC: Jan Beulich <jbeulich@suse.com> > CC: Wei Liu <wl@xen.org> > --- > Changes v3: > - expose evtchn_alloc_unbound, assing a skip_xsm parameter > --- > xen/common/event_channel.c | 13 ++++++++----- > xen/include/xen/event.h | 3 +++ > 2 files changed, 11 insertions(+), 5 deletions(-) > > diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c > index da88ad141a..be57d00a15 100644 > --- a/xen/common/event_channel.c > +++ b/xen/common/event_channel.c > @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn) > xsm_evtchn_close_post(chn); > } > > -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) > { > struct evtchn *chn; > struct domain *d; > @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > ERROR_EXIT_DOM(port, d); > chn = evtchn_from_port(d, port); > > - rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); > - if ( rc ) > - goto out; > + if ( !skip_xsm ) > + { > + rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); > + if ( rc ) > + goto out; > + } Please do not subvert the security framework because it causes an inconvenience. As Jan recommended, work within the XSM check to allow your access so that we may ensure it is done safely. If you need any help making modifications to XSM, please do not hesitate to reach out as I will gladly help. > evtchn_write_lock(chn); > > @@ -1195,7 +1198,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) > struct evtchn_alloc_unbound alloc_unbound; > if ( copy_from_guest(&alloc_unbound, arg, 1) != 0 ) > return -EFAULT; > - rc = evtchn_alloc_unbound(&alloc_unbound); > + rc = evtchn_alloc_unbound(&alloc_unbound, false); > if ( !rc && __copy_to_guest(arg, &alloc_unbound, 1) ) > rc = -EFAULT; /* Cleaning up here would be a mess! */ > break; > diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h > index 21c95e14fd..0a2cdedf7d 100644 > --- a/xen/include/xen/event.h > +++ b/xen/include/xen/event.h > @@ -68,6 +68,9 @@ int evtchn_close(struct domain *d1, int port1, bool guest); > /* Free an event channel. */ > void evtchn_free(struct domain *d, struct evtchn *chn); > > +/* Create a new event channel port */ > +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm); > + > /* Allocate a specific event channel port. */ > int evtchn_allocate_port(struct domain *d, unsigned int port); >
On Tue, 15 Mar 2022, Daniel P. Smith wrote: > On 1/28/22 16:33, Stefano Stabellini wrote: > > From: Luca Miccio <lucmiccio@gmail.com> > > > > The xenstore event channel will be allocated for dom0less domains. It is > > necessary to have access to the evtchn_alloc_unbound function to do > > that, so make evtchn_alloc_unbound public. > > > > Add a skip_xsm parameter to allow disabling the XSM check in > > evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call > > originated from Xen before running any domains.) > > > > Signed-off-by: Luca Miccio <lucmiccio@gmail.com> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> > > CC: Julien Grall <julien@xen.org> > > CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> > > CC: Bertrand Marquis <bertrand.marquis@arm.com> > > CC: Andrew Cooper <andrew.cooper3@citrix.com> > > CC: George Dunlap <george.dunlap@citrix.com> > > CC: Jan Beulich <jbeulich@suse.com> > > CC: Wei Liu <wl@xen.org> > > --- > > Changes v3: > > - expose evtchn_alloc_unbound, assing a skip_xsm parameter > > --- > > xen/common/event_channel.c | 13 ++++++++----- > > xen/include/xen/event.h | 3 +++ > > 2 files changed, 11 insertions(+), 5 deletions(-) > > > > diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c > > index da88ad141a..be57d00a15 100644 > > --- a/xen/common/event_channel.c > > +++ b/xen/common/event_channel.c > > @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn) > > xsm_evtchn_close_post(chn); > > } > > > > -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > > +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) > > { > > struct evtchn *chn; > > struct domain *d; > > @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > > ERROR_EXIT_DOM(port, d); > > chn = evtchn_from_port(d, port); > > > > - rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); > > - if ( rc ) > > - goto out; > > + if ( !skip_xsm ) > > + { > > + rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); > > + if ( rc ) > > + goto out; > > + } > > Please do not subvert the security framework because it causes an > inconvenience. As Jan recommended, work within the XSM check to allow > your access so that we may ensure it is done safely. If you need any > help making modifications to XSM, please do not hesitate to reach out as > I will gladly help. Thank you! First let me reply to Jan: this series is only introducing 1 more call to evtchn_alloc_unbound, which is to allocate the special xenstore event channel, the one configured via d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN]. It is not meant to be a generic function, and it is not meant to be called more than once. It could (should?) be __init. The existing XSM check in evtchn_alloc_unbound cannot work and should not work: it is based on the current domain having enough privileges to create the event channel. In this case, we have no current domain at all. The current domain is Xen itself. For these reasons, given [1], also not to subvert the security framework as Daniel pointed out, I think I should go back to my own implementation [2][3] based on get_free_port. That is my preference because: - the Xen codebase doesn't gain much by reusing evtchn_alloc_unbound - adding skip_xsm introduces a component of risk (unless we make it __init maybe?) - using get_free_port is trivial and doesn't pose the same issues Let's find all an agreement on how to move forward on this. [1] https://marc.info/?l=xen-devel&m=164194128922838 [2] https://marc.info/?l=xen-devel&m=164203543615114 [3] https://marc.info/?l=xen-devel&m=164203544615129
On 3/22/22 20:22, Stefano Stabellini wrote: > On Tue, 15 Mar 2022, Daniel P. Smith wrote: >> On 1/28/22 16:33, Stefano Stabellini wrote: >>> From: Luca Miccio <lucmiccio@gmail.com> >>> >>> The xenstore event channel will be allocated for dom0less domains. It is >>> necessary to have access to the evtchn_alloc_unbound function to do >>> that, so make evtchn_alloc_unbound public. >>> >>> Add a skip_xsm parameter to allow disabling the XSM check in >>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call >>> originated from Xen before running any domains.) >>> >>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> >>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> >>> CC: Julien Grall <julien@xen.org> >>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> >>> CC: Bertrand Marquis <bertrand.marquis@arm.com> >>> CC: Andrew Cooper <andrew.cooper3@citrix.com> >>> CC: George Dunlap <george.dunlap@citrix.com> >>> CC: Jan Beulich <jbeulich@suse.com> >>> CC: Wei Liu <wl@xen.org> >>> --- >>> Changes v3: >>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter >>> --- >>> xen/common/event_channel.c | 13 ++++++++----- >>> xen/include/xen/event.h | 3 +++ >>> 2 files changed, 11 insertions(+), 5 deletions(-) >>> >>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c >>> index da88ad141a..be57d00a15 100644 >>> --- a/xen/common/event_channel.c >>> +++ b/xen/common/event_channel.c >>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn) >>> xsm_evtchn_close_post(chn); >>> } >>> >>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) >>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) >>> { >>> struct evtchn *chn; >>> struct domain *d; >>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) >>> ERROR_EXIT_DOM(port, d); >>> chn = evtchn_from_port(d, port); >>> >>> - rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); >>> - if ( rc ) >>> - goto out; >>> + if ( !skip_xsm ) >>> + { >>> + rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); >>> + if ( rc ) >>> + goto out; >>> + } >> >> Please do not subvert the security framework because it causes an >> inconvenience. As Jan recommended, work within the XSM check to allow >> your access so that we may ensure it is done safely. If you need any >> help making modifications to XSM, please do not hesitate to reach out as >> I will gladly help. > > Thank you! > > First let me reply to Jan: this series is only introducing 1 more call > to evtchn_alloc_unbound, which is to allocate the special xenstore event > channel, the one configured via > d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN]. > > It is not meant to be a generic function, and it is not meant to be > called more than once. It could (should?) be __init. > > The existing XSM check in evtchn_alloc_unbound cannot work and should > not work: it is based on the current domain having enough privileges to > create the event channel. In this case, we have no current domain at > all. The current domain is Xen itself. I have already replicated this in hyperlaunch for PV construction where I have constructed the event channel for both xenstore and the console. For hyperlaunch the construction is under a single, fairly-tight function where I have promoted the Idle Domain to is_privileged before the creation/construction loop starts and then demote the Idle Domain on the other side of the loop. Honestly this is not my preferred approach but for the initial implementation I do have a moderate amount of confidence regarding the risk that results. My current thinking is that the more appropriate approach would be to introduce a new system domain, Construct Domain??, to provide a separate context under which all the hyperlaunch creation and construction logic would be done and then destroyed as part of init finalization. > For these reasons, given [1], also not to subvert the security > framework as Daniel pointed out, I think I should go back to my own > implementation [2][3] based on get_free_port. That is my preference > because: > > - the Xen codebase doesn't gain much by reusing evtchn_alloc_unbound > - adding skip_xsm introduces a component of risk (unless we make it > __init maybe?) > - using get_free_port is trivial and doesn't pose the same issues > > > Let's find all an agreement on how to move forward on this. > > > [1] https://marc.info/?l=xen-devel&m=164194128922838 > [2] https://marc.info/?l=xen-devel&m=164203543615114 > [3] https://marc.info/?l=xen-devel&m=164203544615129
On 23.03.2022 01:22, Stefano Stabellini wrote: > On Tue, 15 Mar 2022, Daniel P. Smith wrote: >> On 1/28/22 16:33, Stefano Stabellini wrote: >>> From: Luca Miccio <lucmiccio@gmail.com> >>> >>> The xenstore event channel will be allocated for dom0less domains. It is >>> necessary to have access to the evtchn_alloc_unbound function to do >>> that, so make evtchn_alloc_unbound public. >>> >>> Add a skip_xsm parameter to allow disabling the XSM check in >>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call >>> originated from Xen before running any domains.) >>> >>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> >>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> >>> CC: Julien Grall <julien@xen.org> >>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> >>> CC: Bertrand Marquis <bertrand.marquis@arm.com> >>> CC: Andrew Cooper <andrew.cooper3@citrix.com> >>> CC: George Dunlap <george.dunlap@citrix.com> >>> CC: Jan Beulich <jbeulich@suse.com> >>> CC: Wei Liu <wl@xen.org> >>> --- >>> Changes v3: >>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter >>> --- >>> xen/common/event_channel.c | 13 ++++++++----- >>> xen/include/xen/event.h | 3 +++ >>> 2 files changed, 11 insertions(+), 5 deletions(-) >>> >>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c >>> index da88ad141a..be57d00a15 100644 >>> --- a/xen/common/event_channel.c >>> +++ b/xen/common/event_channel.c >>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn) >>> xsm_evtchn_close_post(chn); >>> } >>> >>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) >>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) >>> { >>> struct evtchn *chn; >>> struct domain *d; >>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) >>> ERROR_EXIT_DOM(port, d); >>> chn = evtchn_from_port(d, port); >>> >>> - rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); >>> - if ( rc ) >>> - goto out; >>> + if ( !skip_xsm ) >>> + { >>> + rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); >>> + if ( rc ) >>> + goto out; >>> + } >> >> Please do not subvert the security framework because it causes an >> inconvenience. As Jan recommended, work within the XSM check to allow >> your access so that we may ensure it is done safely. If you need any >> help making modifications to XSM, please do not hesitate to reach out as >> I will gladly help. > > Thank you! > > First let me reply to Jan: this series is only introducing 1 more call > to evtchn_alloc_unbound, which is to allocate the special xenstore event > channel, the one configured via > d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN]. > > It is not meant to be a generic function, and it is not meant to be > called more than once. It could (should?) be __init. How that? Its pre-existing use doesn't disappear, and requires it to be non-__init. > The existing XSM check in evtchn_alloc_unbound cannot work and should > not work: it is based on the current domain having enough privileges to > create the event channel. In this case, we have no current domain at > all. The current domain is Xen itself. And DOM_XEN cannot be given the appropriate permission, perhaps explicitly when using a real policy and by default in dummy and SILO modes? Jan > For these reasons, given [1], also not to subvert the security > framework as Daniel pointed out, I think I should go back to my own > implementation [2][3] based on get_free_port. That is my preference > because: > > - the Xen codebase doesn't gain much by reusing evtchn_alloc_unbound > - adding skip_xsm introduces a component of risk (unless we make it > __init maybe?) > - using get_free_port is trivial and doesn't pose the same issues > > > Let's find all an agreement on how to move forward on this. > > > [1] https://marc.info/?l=xen-devel&m=164194128922838 > [2] https://marc.info/?l=xen-devel&m=164203543615114 > [3] https://marc.info/?l=xen-devel&m=164203544615129 >
On Wed, 23 Mar 2022, Jan Beulich wrote: > On 23.03.2022 01:22, Stefano Stabellini wrote: > > On Tue, 15 Mar 2022, Daniel P. Smith wrote: > >> On 1/28/22 16:33, Stefano Stabellini wrote: > >>> From: Luca Miccio <lucmiccio@gmail.com> > >>> > >>> The xenstore event channel will be allocated for dom0less domains. It is > >>> necessary to have access to the evtchn_alloc_unbound function to do > >>> that, so make evtchn_alloc_unbound public. > >>> > >>> Add a skip_xsm parameter to allow disabling the XSM check in > >>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call > >>> originated from Xen before running any domains.) > >>> > >>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> > >>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> > >>> CC: Julien Grall <julien@xen.org> > >>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> > >>> CC: Bertrand Marquis <bertrand.marquis@arm.com> > >>> CC: Andrew Cooper <andrew.cooper3@citrix.com> > >>> CC: George Dunlap <george.dunlap@citrix.com> > >>> CC: Jan Beulich <jbeulich@suse.com> > >>> CC: Wei Liu <wl@xen.org> > >>> --- > >>> Changes v3: > >>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter > >>> --- > >>> xen/common/event_channel.c | 13 ++++++++----- > >>> xen/include/xen/event.h | 3 +++ > >>> 2 files changed, 11 insertions(+), 5 deletions(-) > >>> > >>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c > >>> index da88ad141a..be57d00a15 100644 > >>> --- a/xen/common/event_channel.c > >>> +++ b/xen/common/event_channel.c > >>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn) > >>> xsm_evtchn_close_post(chn); > >>> } > >>> > >>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > >>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) > >>> { > >>> struct evtchn *chn; > >>> struct domain *d; > >>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > >>> ERROR_EXIT_DOM(port, d); > >>> chn = evtchn_from_port(d, port); > >>> > >>> - rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); > >>> - if ( rc ) > >>> - goto out; > >>> + if ( !skip_xsm ) > >>> + { > >>> + rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); > >>> + if ( rc ) > >>> + goto out; > >>> + } > >> > >> Please do not subvert the security framework because it causes an > >> inconvenience. As Jan recommended, work within the XSM check to allow > >> your access so that we may ensure it is done safely. If you need any > >> help making modifications to XSM, please do not hesitate to reach out as > >> I will gladly help. > > > > Thank you! > > > > First let me reply to Jan: this series is only introducing 1 more call > > to evtchn_alloc_unbound, which is to allocate the special xenstore event > > channel, the one configured via > > d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN]. > > > > It is not meant to be a generic function, and it is not meant to be > > called more than once. It could (should?) be __init. > > How that? Its pre-existing use doesn't disappear, and requires it to be > non-__init. Sorry I meant the new function (calling get_free_port) for the new use-case could be __init. The new function could be added to xen/common/event_channel.c or to xen/arch/arm/domain_build.c. > > The existing XSM check in evtchn_alloc_unbound cannot work and should > > not work: it is based on the current domain having enough privileges to > > create the event channel. In this case, we have no current domain at > > all. The current domain is Xen itself. > > And DOM_XEN cannot be given the appropriate permission, perhaps > explicitly when using a real policy and by default in dummy and SILO > modes? The issue is that the check is based on "current", not one of the domains passed as an argument to evtchn_alloc_unbound. Otherwise, passing DOMID_XEN would be straightforward. We would need to use a hack (like Daniel wrote in the other email) to set the idle_domain temporarily as a privileged domain only for the sake of passing an irrelevant (irrelevant as in "not relevant to this case") XSM check. That cannot be an improvement. Also, setting current to a "fake" domain is not great either. In the specific case of dom0less and this patch, this is the only instance of this issue and could be solved very straightforwardly by calling get_free_port directly as we discussed [1]. I know Julien had some reservations about that. Let's try to find a technical solution that makes everyone happy. Maybe, instead of exporting get_free_port, we could create a new function in xen/common/event_channel.c and mark it as __init? This way: - we don't need to expose get_free_port - the new function would only be __init anyway, so no risk of runtime misuse What do you think? [1] https://marc.info/?l=xen-devel&m=164197327305903
Hi Stefano, On 25/03/2022 00:30, Stefano Stabellini wrote: > On Wed, 23 Mar 2022, Jan Beulich wrote: >> On 23.03.2022 01:22, Stefano Stabellini wrote: >>> On Tue, 15 Mar 2022, Daniel P. Smith wrote: >>>> On 1/28/22 16:33, Stefano Stabellini wrote: >>>>> From: Luca Miccio <lucmiccio@gmail.com> >>>>> >>>>> The xenstore event channel will be allocated for dom0less domains. It is >>>>> necessary to have access to the evtchn_alloc_unbound function to do >>>>> that, so make evtchn_alloc_unbound public. >>>>> >>>>> Add a skip_xsm parameter to allow disabling the XSM check in >>>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call >>>>> originated from Xen before running any domains.) >>>>> >>>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> >>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> >>>>> CC: Julien Grall <julien@xen.org> >>>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> >>>>> CC: Bertrand Marquis <bertrand.marquis@arm.com> >>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com> >>>>> CC: George Dunlap <george.dunlap@citrix.com> >>>>> CC: Jan Beulich <jbeulich@suse.com> >>>>> CC: Wei Liu <wl@xen.org> >>>>> --- >>>>> Changes v3: >>>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter >>>>> --- >>>>> xen/common/event_channel.c | 13 ++++++++----- >>>>> xen/include/xen/event.h | 3 +++ >>>>> 2 files changed, 11 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c >>>>> index da88ad141a..be57d00a15 100644 >>>>> --- a/xen/common/event_channel.c >>>>> +++ b/xen/common/event_channel.c >>>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn) >>>>> xsm_evtchn_close_post(chn); >>>>> } >>>>> >>>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) >>>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) >>>>> { >>>>> struct evtchn *chn; >>>>> struct domain *d; >>>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) >>>>> ERROR_EXIT_DOM(port, d); >>>>> chn = evtchn_from_port(d, port); >>>>> >>>>> - rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); >>>>> - if ( rc ) >>>>> - goto out; >>>>> + if ( !skip_xsm ) >>>>> + { >>>>> + rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); >>>>> + if ( rc ) >>>>> + goto out; >>>>> + } >>>> >>>> Please do not subvert the security framework because it causes an >>>> inconvenience. As Jan recommended, work within the XSM check to allow >>>> your access so that we may ensure it is done safely. If you need any >>>> help making modifications to XSM, please do not hesitate to reach out as >>>> I will gladly help. >>> >>> Thank you! >>> >>> First let me reply to Jan: this series is only introducing 1 more call >>> to evtchn_alloc_unbound, which is to allocate the special xenstore event >>> channel, the one configured via >>> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN]. >>> >>> It is not meant to be a generic function, and it is not meant to be >>> called more than once. It could (should?) be __init. >> >> How that? Its pre-existing use doesn't disappear, and requires it to be >> non-__init. > > Sorry I meant the new function (calling get_free_port) for the new > use-case could be __init. The new function could be added to > xen/common/event_channel.c or to xen/arch/arm/domain_build.c. > > >>> The existing XSM check in evtchn_alloc_unbound cannot work and should >>> not work: it is based on the current domain having enough privileges to >>> create the event channel. In this case, we have no current domain at >>> all. The current domain is Xen itself. >> >> And DOM_XEN cannot be given the appropriate permission, perhaps >> explicitly when using a real policy and by default in dummy and SILO >> modes? > > The issue is that the check is based on "current", not one of the > domains passed as an argument to evtchn_alloc_unbound. Otherwise, > passing DOMID_XEN would be straightforward. > > We would need to use a hack (like Daniel wrote in the other email) to > set the idle_domain temporarily as a privileged domain only for the sake > of passing an irrelevant (irrelevant as in "not relevant to this case") > XSM check. That cannot be an improvement. Also, setting current to a > "fake" domain is not great either. I agree they are not great. But this needs to be compared with the other proposals. AFAIU, your proposal is to duplicate code. This brings other risks such as duplicated bug and more code to maintain. In your case, you only need one helper. But in some other context (e.g. Live-Update and it looks like Hyperlaunch), we may need to re-use more hypercalls code. So to me, the idea of switching to a "fake" domain or bypassing the check is more appealing. I have a preference for the "fake" domain here. > > In the specific case of dom0less and this patch, this is the only > instance of this issue and could be solved very straightforwardly by > calling get_free_port directly as we discussed [1]. Exporting get_free_port() is a no-go for me. This can be easily mis-used and in fact you already did it in your proposal by not using the proper locking (I appreciate this is meant to be boot code only). > > I know Julien had some reservations about that. Let's try to find a > technical solution that makes everyone happy. > > Maybe, instead of exporting get_free_port, we could create a new > function in xen/common/event_channel.c and mark it as __init? This way: > - we don't need to expose get_free_port > - the new function would only be __init anyway, so no risk of runtime > misuse I think the code duplication is short-sighted. I could possibly accept one instance, but I suspect the proposal [1] will end up to add more. So IMHO we should try to resolve it generically now. Cheers, [1] <4836304496e6fbbea41348ed8cc9fcf6b0f3e893.1648049827.git.rahul.singh@arm.com> -- Julien Grall
On Fri, 25 Mar 2022, Julien Grall wrote: > So to me, the idea of switching to a "fake" domain or bypassing the check is > more appealing. I have a preference for the "fake" domain here. As a maintainer, I am not opposed to the "fake"/"contructor" domain concept. It all depends on how many instances of this issue we are going to have. This is the only one on xen-devel so far. I don't think it is worth adding a constructor domain for one instance only. But I agree with you and Daniel that if we end up with several instances, then the constructor domain approach is probably the best option overall. As a contributor, sadly I won't be able to spend a lot of time on this in the following months. If a significant rework is required, I don't think I'll be able to do it, at least not for this Xen release (and it would be nice to have dom0less PV drivers in the coming release.) If Daniel is willing, I could add his "idle_domain is_priv" patch to this series. Not as clean as a proper constructor domain but it would work and it would be simple. It could be evolved into a nicer constructor domain later. This is not my preference but I could do that if Julien and Jan prefer this approach and if Daniel is happy to share his patch. > AFAIU, your proposal is to duplicate code. This brings other risks such as > duplicated bug and more code to maintain. Got it. I'll make one last attempt at a proposal that doesn't involve the fake constructor domain. The goal is to avoid code duplication while providing a safe way forward to make progress with only a small amount of changes. What if we: - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static) - add a skip_xsm parameter to _evtchn_alloc_unbound - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to false (same interface as today's evtchn_alloc_unbound) - introduce an __init early_evtchn_alloc_unbound public function that sets skip_xsm to true This way: - we have a single implementation in _evtchn_alloc_unbound (no code duplication) - the only function exposed that skips the XSM check is __init - evtchn_alloc_unbound continue to have the XSM check same as today E.g.: static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) { ... } static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) { return _evtchn_alloc_unbound(alloc, false); } int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) { return _evtchn_alloc_unbound(alloc, true); } Would this be good enough for now?
On 3/25/22 17:05, Stefano Stabellini wrote: > On Fri, 25 Mar 2022, Julien Grall wrote: >> So to me, the idea of switching to a "fake" domain or bypassing the check is >> more appealing. I have a preference for the "fake" domain here. > > As a maintainer, I am not opposed to the "fake"/"contructor" domain > concept. It all depends on how many instances of this issue we are > going to have. This is the only one on xen-devel so far. I don't think > it is worth adding a constructor domain for one instance only. But I > agree with you and Daniel that if we end up with several instances, then > the constructor domain approach is probably the best option overall. > > > As a contributor, sadly I won't be able to spend a lot of time on this > in the following months. If a significant rework is required, I don't > think I'll be able to do it, at least not for this Xen release (and it > would be nice to have dom0less PV drivers in the coming release.) If > Daniel is willing, I could add his "idle_domain is_priv" patch to this > series. Not as clean as a proper constructor domain but it would work > and it would be simple. It could be evolved into a nicer constructor > domain later. > > This is not my preference but I could do that if Julien and Jan prefer > this approach and if Daniel is happy to share his patch. > > >> AFAIU, your proposal is to duplicate code. This brings other risks such as >> duplicated bug and more code to maintain. > > Got it. I'll make one last attempt at a proposal that doesn't involve > the fake constructor domain. The goal is to avoid code duplication while > providing a safe way forward to make progress with only a small amount > of changes. What if we: > > - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static) > - add a skip_xsm parameter to _evtchn_alloc_unbound > - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to > false (same interface as today's evtchn_alloc_unbound) > - introduce an __init early_evtchn_alloc_unbound public function that > sets skip_xsm to true > > This way: > - we have a single implementation in _evtchn_alloc_unbound (no code > duplication) > - the only function exposed that skips the XSM check is __init > - evtchn_alloc_unbound continue to have the XSM check same as today > > > E.g.: > static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) > { > ... > } > > static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > { > return _evtchn_alloc_unbound(alloc, false); > } > > int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > { > return _evtchn_alloc_unbound(alloc, true); > } > > > Would this be good enough for now? Please see the RFC patch I just posted[1], IMHO I think this is a safer approach for this specific instance. [1] https://lore.kernel.org/xen-devel/20220328203622.30961-1-dpsmith@apertussolutions.com/T/#t v/r, dps
On Mon, 28 Mar 2022, Daniel P. Smith wrote: > On 3/25/22 17:05, Stefano Stabellini wrote: > > On Fri, 25 Mar 2022, Julien Grall wrote: > >> So to me, the idea of switching to a "fake" domain or bypassing the check is > >> more appealing. I have a preference for the "fake" domain here. > > > > As a maintainer, I am not opposed to the "fake"/"contructor" domain > > concept. It all depends on how many instances of this issue we are > > going to have. This is the only one on xen-devel so far. I don't think > > it is worth adding a constructor domain for one instance only. But I > > agree with you and Daniel that if we end up with several instances, then > > the constructor domain approach is probably the best option overall. > > > > > > As a contributor, sadly I won't be able to spend a lot of time on this > > in the following months. If a significant rework is required, I don't > > think I'll be able to do it, at least not for this Xen release (and it > > would be nice to have dom0less PV drivers in the coming release.) If > > Daniel is willing, I could add his "idle_domain is_priv" patch to this > > series. Not as clean as a proper constructor domain but it would work > > and it would be simple. It could be evolved into a nicer constructor > > domain later. > > > > This is not my preference but I could do that if Julien and Jan prefer > > this approach and if Daniel is happy to share his patch. > > > > > >> AFAIU, your proposal is to duplicate code. This brings other risks such as > >> duplicated bug and more code to maintain. > > > > Got it. I'll make one last attempt at a proposal that doesn't involve > > the fake constructor domain. The goal is to avoid code duplication while > > providing a safe way forward to make progress with only a small amount > > of changes. What if we: > > > > - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static) > > - add a skip_xsm parameter to _evtchn_alloc_unbound > > - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to > > false (same interface as today's evtchn_alloc_unbound) > > - introduce an __init early_evtchn_alloc_unbound public function that > > sets skip_xsm to true > > > > This way: > > - we have a single implementation in _evtchn_alloc_unbound (no code > > duplication) > > - the only function exposed that skips the XSM check is __init > > - evtchn_alloc_unbound continue to have the XSM check same as today > > > > > > E.g.: > > static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) > > { > > ... > > } > > > > static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > > { > > return _evtchn_alloc_unbound(alloc, false); > > } > > > > int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > > { > > return _evtchn_alloc_unbound(alloc, true); > > } > > > > > > Would this be good enough for now? > > Please see the RFC patch I just posted[1], IMHO I think this is a safer > approach for this specific instance. > > [1] > https://lore.kernel.org/xen-devel/20220328203622.30961-1-dpsmith@apertussolutions.com/T/#t I read it, the patch looks fine. I also tested it together with my series and it solves the problem. With [1], it is just a matter of making evtchn_alloc_unbound as is non-static. If the other maintainers also agree with [1], then I'll just rebase on it and limit my changes to exporting evtchn_alloc_unbound.
On 3/25/22 17:05, Stefano Stabellini wrote: > On Fri, 25 Mar 2022, Julien Grall wrote: >> So to me, the idea of switching to a "fake" domain or bypassing the check is >> more appealing. I have a preference for the "fake" domain here. > > As a maintainer, I am not opposed to the "fake"/"contructor" domain > concept. It all depends on how many instances of this issue we are > going to have. This is the only one on xen-devel so far. I don't think > it is worth adding a constructor domain for one instance only. But I > agree with you and Daniel that if we end up with several instances, then > the constructor domain approach is probably the best option overall. The constructor domain still needs more discussion and would likely be part of a larger approach that will require buy-in from several maintainers and should be looking to solve a more general problem internal access control of which domain construction within the hypervisor is just one case. For this I would be glad to start a working group, for which the start of can add to the next community call agenda. > As a contributor, sadly I won't be able to spend a lot of time on this > in the following months. If a significant rework is required, I don't > think I'll be able to do it, at least not for this Xen release (and it > would be nice to have dom0less PV drivers in the coming release.) If > Daniel is willing, I could add his "idle_domain is_priv" patch to this > series. Not as clean as a proper constructor domain but it would work > and it would be simple. It could be evolved into a nicer constructor > domain later. > > This is not my preference but I could do that if Julien and Jan prefer > this approach and if Daniel is happy to share his patch. I can look to spin out a general version of what I am doing, likely exposed as an XSM call so it can be handled appropriately across policies. >> AFAIU, your proposal is to duplicate code. This brings other risks such as >> duplicated bug and more code to maintain. > > Got it. I'll make one last attempt at a proposal that doesn't involve > the fake constructor domain. The goal is to avoid code duplication while > providing a safe way forward to make progress with only a small amount > of changes. What if we: > > - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static) > - add a skip_xsm parameter to _evtchn_alloc_unbound > - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to > false (same interface as today's evtchn_alloc_unbound) > - introduce an __init early_evtchn_alloc_unbound public function that > sets skip_xsm to true > > This way: > - we have a single implementation in _evtchn_alloc_unbound (no code > duplication) > - the only function exposed that skips the XSM check is __init > - evtchn_alloc_unbound continue to have the XSM check same as today > > > E.g.: > static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) > { > ... > } > > static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > { > return _evtchn_alloc_unbound(alloc, false); > } > > int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > { > return _evtchn_alloc_unbound(alloc, true); > } > > > Would this be good enough for now? I guest the question is if it is okay for this to exist until the new XSM calls are found to be acceptable and then this is reverted/changed to the XSM calls? v/r dps
Hi Stefano, On 25/03/2022 21:05, Stefano Stabellini wrote: > On Fri, 25 Mar 2022, Julien Grall wrote: > As a contributor, sadly I won't be able to spend a lot of time on this > in the following months. If a significant rework is required, I don't > think I'll be able to do it, at least not for this Xen release (and it > would be nice to have dom0less PV drivers in the coming release.) If > Daniel is willing, I could add his "idle_domain is_priv" patch to this > series. Not as clean as a proper constructor domain but it would work > and it would be simple. It could be evolved into a nicer constructor > domain later. > > This is not my preference but I could do that if Julien and Jan prefer > this approach and if Daniel is happy to share his patch. This is still my preference because we are avoiding to push the problem to the unlucky person that will need to introduce another (or multiple) 'skip_xsm'. >> AFAIU, your proposal is to duplicate code. This brings other risks such as >> duplicated bug and more code to maintain. > > Got it. I'll make one last attempt at a proposal that doesn't involve > the fake constructor domain. The goal is to avoid code duplication while > providing a safe way forward to make progress with only a small amount > of changes. What if we: > > - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static) > - add a skip_xsm parameter to _evtchn_alloc_unbound > - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to > false (same interface as today's evtchn_alloc_unbound) > - introduce an __init early_evtchn_alloc_unbound public function that > sets skip_xsm to true > > This way: > - we have a single implementation in _evtchn_alloc_unbound (no code > duplication) > - the only function exposed that skips the XSM check is __init > - evtchn_alloc_unbound continue to have the XSM check same as today > > > E.g.: > static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) > { > ... > } > > static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > { > return _evtchn_alloc_unbound(alloc, false); > } > > int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) > { > return _evtchn_alloc_unbound(alloc, true); > } > > > Would this be good enough for now? I would be OK with that. Note that, I think we need to protect the use of skip_xsm with evaluate_nospec(). Cheers, -- Julien Grall
On 3/24/22 20:30, Stefano Stabellini wrote: > On Wed, 23 Mar 2022, Jan Beulich wrote: >> On 23.03.2022 01:22, Stefano Stabellini wrote: >>> On Tue, 15 Mar 2022, Daniel P. Smith wrote: >>>> On 1/28/22 16:33, Stefano Stabellini wrote: >>>>> From: Luca Miccio <lucmiccio@gmail.com> >>>>> >>>>> The xenstore event channel will be allocated for dom0less domains. It is >>>>> necessary to have access to the evtchn_alloc_unbound function to do >>>>> that, so make evtchn_alloc_unbound public. >>>>> >>>>> Add a skip_xsm parameter to allow disabling the XSM check in >>>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call >>>>> originated from Xen before running any domains.) >>>>> >>>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> >>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> >>>>> CC: Julien Grall <julien@xen.org> >>>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> >>>>> CC: Bertrand Marquis <bertrand.marquis@arm.com> >>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com> >>>>> CC: George Dunlap <george.dunlap@citrix.com> >>>>> CC: Jan Beulich <jbeulich@suse.com> >>>>> CC: Wei Liu <wl@xen.org> >>>>> --- >>>>> Changes v3: >>>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter >>>>> --- >>>>> xen/common/event_channel.c | 13 ++++++++----- >>>>> xen/include/xen/event.h | 3 +++ >>>>> 2 files changed, 11 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c >>>>> index da88ad141a..be57d00a15 100644 >>>>> --- a/xen/common/event_channel.c >>>>> +++ b/xen/common/event_channel.c >>>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn) >>>>> xsm_evtchn_close_post(chn); >>>>> } >>>>> >>>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) >>>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm) >>>>> { >>>>> struct evtchn *chn; >>>>> struct domain *d; >>>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) >>>>> ERROR_EXIT_DOM(port, d); >>>>> chn = evtchn_from_port(d, port); >>>>> >>>>> - rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); >>>>> - if ( rc ) >>>>> - goto out; >>>>> + if ( !skip_xsm ) >>>>> + { >>>>> + rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); >>>>> + if ( rc ) >>>>> + goto out; >>>>> + } >>>> >>>> Please do not subvert the security framework because it causes an >>>> inconvenience. As Jan recommended, work within the XSM check to allow >>>> your access so that we may ensure it is done safely. If you need any >>>> help making modifications to XSM, please do not hesitate to reach out as >>>> I will gladly help. >>> >>> Thank you! >>> >>> First let me reply to Jan: this series is only introducing 1 more call >>> to evtchn_alloc_unbound, which is to allocate the special xenstore event >>> channel, the one configured via >>> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN]. >>> >>> It is not meant to be a generic function, and it is not meant to be >>> called more than once. It could (should?) be __init. >> >> How that? Its pre-existing use doesn't disappear, and requires it to be >> non-__init. > > Sorry I meant the new function (calling get_free_port) for the new > use-case could be __init. The new function could be added to > xen/common/event_channel.c or to xen/arch/arm/domain_build.c. > > >>> The existing XSM check in evtchn_alloc_unbound cannot work and should >>> not work: it is based on the current domain having enough privileges to >>> create the event channel. In this case, we have no current domain at >>> all. The current domain is Xen itself. >> >> And DOM_XEN cannot be given the appropriate permission, perhaps >> explicitly when using a real policy and by default in dummy and SILO >> modes? > > The issue is that the check is based on "current", not one of the > domains passed as an argument to evtchn_alloc_unbound. Otherwise, > passing DOMID_XEN would be straightforward. > > We would need to use a hack (like Daniel wrote in the other email) to > set the idle_domain temporarily as a privileged domain only for the sake > of passing an irrelevant (irrelevant as in "not relevant to this case") > XSM check. That cannot be an improvement. Also, setting current to a > "fake" domain is not great either. My suggestion was not to intended to be simply a hack but looking at the larger issue instead of simply doing a targeted fix for this one instnace. While I cannot give an example right off hand, the reality is, at least for hyperlaunch, that we cannot say for certain there will not be further resource allocations that is protected by the security framework and will require preliminary handling by the construction logic in the hypervisor. The low-complexity approach is to address each one in a case-by-case fashion using direct calls that go around the security framework. A more security conscience, and higher complexity, approach would be to consider a least-privilege approach and look at introducing the ability to do controlled switching of contexts, i.e. moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is granted only the necessary privileges to do the resource allocations in which it is limited. This is also not the first time this issue has come up, I don't recall the exact thread but several months ago someone ran into the issue they need to make a call to a resource function and was blocked by XSM because DOMID_IDLE has no privileges. The reality is that the idea of monolithic high-privileged entities is being dropped in favor of least-privilege, and where possible hardware enforced, constraint. This can be seen with Intel de-privileging SMM and running SMI handlers in constrained ring 3. Arm is gaining capability pointers, CHERI, that will enable the possibility for constrained, least-privileged kernel subsystems. Would it not be advantageous for Xen to start moving in such a direction that would enable it to provide a new level of safety and security for consumers of Xen? Coming back to the short-term, I would advocate for introducing the concept and abstraction of constrained context switching through a set of function calls, which would likely be under XSM to allow policy enforcement. Likely the introductory implementation would just mask the fact that it is just setting `is_privileged` for DOMID_IDLE. Future evolution of the capability could see the introduction of new "contexts", whether they are represented by a domain could be determined then, and the ability to do controlled switching based on policy. Just wanted to put more of my thinking out there before a path is taken. > In the specific case of dom0less and this patch, this is the only > instance of this issue and could be solved very straightforwardly by > calling get_free_port directly as we discussed [1]. > > I know Julien had some reservations about that. Let's try to find a > technical solution that makes everyone happy. > > Maybe, instead of exporting get_free_port, we could create a new > function in xen/common/event_channel.c and mark it as __init? This way: > - we don't need to expose get_free_port > - the new function would only be __init anyway, so no risk of runtime > misuse > > What do you think? > > [1] https://marc.info/?l=xen-devel&m=164197327305903
Hi Daniel, On 25/03/2022 15:45, Daniel P. Smith wrote: >>>> The existing XSM check in evtchn_alloc_unbound cannot work and should >>>> not work: it is based on the current domain having enough privileges to >>>> create the event channel. In this case, we have no current domain at >>>> all. The current domain is Xen itself. >>> >>> And DOM_XEN cannot be given the appropriate permission, perhaps >>> explicitly when using a real policy and by default in dummy and SILO >>> modes? >> >> The issue is that the check is based on "current", not one of the >> domains passed as an argument to evtchn_alloc_unbound. Otherwise, >> passing DOMID_XEN would be straightforward. >> >> We would need to use a hack (like Daniel wrote in the other email) to >> set the idle_domain temporarily as a privileged domain only for the sake >> of passing an irrelevant (irrelevant as in "not relevant to this case") >> XSM check. That cannot be an improvement. Also, setting current to a >> "fake" domain is not great either. > > My suggestion was not to intended to be simply a hack but looking at the > larger issue instead of simply doing a targeted fix for this one > instnace. While I cannot give an example right off hand, the reality is, > at least for hyperlaunch, that we cannot say for certain there will not > be further resource allocations that is protected by the security > framework and will require preliminary handling by the construction > logic in the hypervisor. The low-complexity approach is to address each > one in a case-by-case fashion using direct calls that go around the > security framework. A more security conscience, and higher complexity, > approach would be to consider a least-privilege approach and look at > introducing the ability to do controlled switching of contexts, i.e. > moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is > granted only the necessary privileges to do the resource allocations in > which it is limited. > > This is also not the first time this issue has come up, I don't recall > the exact thread but several months ago someone ran into the issue they > need to make a call to a resource function and was blocked by XSM > because DOMID_IDLE has no privileges. This was in the context of Live-Updating Xen. We are trying to re-use as much as possible the code used by the hypercalls. Those functions may contain xsm check that would fail because current would be an idle vCPU. For the full context: https://lore.kernel.org/xen-devel/bfd645cf42ef7786183be15c222ad04beed362c0.camel@xen.org/ > The reality is that the idea of > monolithic high-privileged entities is being dropped in favor of > least-privilege, and where possible hardware enforced, constraint. This > can be seen with Intel de-privileging SMM and running SMI handlers in > constrained ring 3. Arm is gaining capability pointers, CHERI, that will > enable the possibility for constrained, least-privileged kernel > subsystems. Would it not be advantageous for Xen to start moving in such > a direction that would enable it to provide a new level of safety and > security for consumers of Xen? > > Coming back to the short-term, I would advocate for introducing the > concept and abstraction of constrained context switching through a set > of function calls, which would likely be under XSM to allow policy > enforcement. Likely the introductory implementation would just mask the > fact that it is just setting `is_privileged` for DOMID_IDLE. Future > evolution of the capability could see the introduction of new > "contexts", whether they are represented by a domain could be determined > then, and the ability to do controlled switching based on policy. +1 with this idea. Cheers, -- Julien Grall
On Fri, Mar 25, 2022 at 11:46 AM Daniel P. Smith <dpsmith.dev@gmail.com> wrote: > > On 3/24/22 20:30, Stefano Stabellini wrote: > > On Wed, 23 Mar 2022, Jan Beulich wrote: > >> On 23.03.2022 01:22, Stefano Stabellini wrote: > >>> The existing XSM check in evtchn_alloc_unbound cannot work and should > >>> not work: it is based on the current domain having enough privileges to > >>> create the event channel. In this case, we have no current domain at > >>> all. The current domain is Xen itself. > >> > >> And DOM_XEN cannot be given the appropriate permission, perhaps > >> explicitly when using a real policy and by default in dummy and SILO > >> modes? > > > > The issue is that the check is based on "current", not one of the > > domains passed as an argument to evtchn_alloc_unbound. Otherwise, > > passing DOMID_XEN would be straightforward. > > > > We would need to use a hack (like Daniel wrote in the other email) to > > set the idle_domain temporarily as a privileged domain only for the sake > > of passing an irrelevant (irrelevant as in "not relevant to this case") > > XSM check. That cannot be an improvement. Also, setting current to a > > "fake" domain is not great either. > > My suggestion was not to intended to be simply a hack but looking at the > larger issue instead of simply doing a targeted fix for this one > instnace. While I cannot give an example right off hand, the reality is, > at least for hyperlaunch, that we cannot say for certain there will not > be further resource allocations that is protected by the security > framework and will require preliminary handling by the construction > logic in the hypervisor. The low-complexity approach is to address each > one in a case-by-case fashion using direct calls that go around the > security framework. A more security conscience, and higher complexity, > approach would be to consider a least-privilege approach and look at > introducing the ability to do controlled switching of contexts, i.e. > moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is > granted only the necessary privileges to do the resource allocations in > which it is limited. > > This is also not the first time this issue has come up, I don't recall > the exact thread but several months ago someone ran into the issue they > need to make a call to a resource function and was blocked by XSM > because DOMID_IDLE has no privileges. The reality is that the idea of > monolithic high-privileged entities is being dropped in favor of > least-privilege, and where possible hardware enforced, constraint. This > can be seen with Intel de-privileging SMM and running SMI handlers in > constrained ring 3. Arm is gaining capability pointers, CHERI, that will > enable the possibility for constrained, least-privileged kernel > subsystems. Would it not be advantageous for Xen to start moving in such > a direction that would enable it to provide a new level of safety and > security for consumers of Xen? > > Coming back to the short-term, I would advocate for introducing the > concept and abstraction of constrained context switching through a set > of function calls, which would likely be under XSM to allow policy > enforcement. Likely the introductory implementation would just mask the > fact that it is just setting `is_privileged` for DOMID_IDLE. Future > evolution of the capability could see the introduction of new > "contexts", whether they are represented by a domain could be determined > then, and the ability to do controlled switching based on policy. For the specific case of evtchn_alloc_unbound, Flask's xsm_evtchn_unbound has a side effect of labeling the event channel. So skipping the hook will have unintended consequences for Flask. xsm_evtchn_unbound could be split in two to have an access piece and a labeling piece. The access piece is run at hypercall entry, and the labeling is still done in evtchn_alloc_unbound. For Flask, labeling depends on the two domain endpoints, but not current. More generally, it seems to me there are too many xsm checks in the middle of functions/operations. They are fine for a normal entry via hypercall, but they interfere with Xen's internal operations. Xen shouldn't be restricted in its own operations. The live update people hit it with domain creation, and I just posted a patch for unmap_domain_pirq. It would be more obvious for auditing if each hypercall entrypoint applied xsm checks. Make the allow/deny decision as early as possible. Then a worker function would be easily callable for the Xen-internal case. The flip side of that is the xsm hook may need sub-op specific data to make its decision, so it fits to put it in the sub-op function. It seems to me the location of hooks was determined by where the data they need is already available. Re-arranging hooks may require some duplication. The xsm controls should clearly apply to the DomUs and other entities managed by Xen. xsm restricting Xen itself seems wrong. Having internal operations get denied by xsm may unintentionally subvert Xen itself. Yes, moving checks outward makes for an un-restricted middle. But the security server running in the same address space isn't going to help against code exec inside the hypervisor. I think I view it like this: Why is a xen internal operation subject to a security check? When would one ever want to deny a xen internal operation? Regards, Jason
On 3/25/22 12:52, Jason Andryuk wrote: > On Fri, Mar 25, 2022 at 11:46 AM Daniel P. Smith <dpsmith.dev@gmail.com> wrote: >> >> On 3/24/22 20:30, Stefano Stabellini wrote: >>> On Wed, 23 Mar 2022, Jan Beulich wrote: >>>> On 23.03.2022 01:22, Stefano Stabellini wrote: >>>>> The existing XSM check in evtchn_alloc_unbound cannot work and should >>>>> not work: it is based on the current domain having enough privileges to >>>>> create the event channel. In this case, we have no current domain at >>>>> all. The current domain is Xen itself. >>>> >>>> And DOM_XEN cannot be given the appropriate permission, perhaps >>>> explicitly when using a real policy and by default in dummy and SILO >>>> modes? >>> >>> The issue is that the check is based on "current", not one of the >>> domains passed as an argument to evtchn_alloc_unbound. Otherwise, >>> passing DOMID_XEN would be straightforward. >>> >>> We would need to use a hack (like Daniel wrote in the other email) to >>> set the idle_domain temporarily as a privileged domain only for the sake >>> of passing an irrelevant (irrelevant as in "not relevant to this case") >>> XSM check. That cannot be an improvement. Also, setting current to a >>> "fake" domain is not great either. >> >> My suggestion was not to intended to be simply a hack but looking at the >> larger issue instead of simply doing a targeted fix for this one >> instnace. While I cannot give an example right off hand, the reality is, >> at least for hyperlaunch, that we cannot say for certain there will not >> be further resource allocations that is protected by the security >> framework and will require preliminary handling by the construction >> logic in the hypervisor. The low-complexity approach is to address each >> one in a case-by-case fashion using direct calls that go around the >> security framework. A more security conscience, and higher complexity, >> approach would be to consider a least-privilege approach and look at >> introducing the ability to do controlled switching of contexts, i.e. >> moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is >> granted only the necessary privileges to do the resource allocations in >> which it is limited. >> >> This is also not the first time this issue has come up, I don't recall >> the exact thread but several months ago someone ran into the issue they >> need to make a call to a resource function and was blocked by XSM >> because DOMID_IDLE has no privileges. The reality is that the idea of >> monolithic high-privileged entities is being dropped in favor of >> least-privilege, and where possible hardware enforced, constraint. This >> can be seen with Intel de-privileging SMM and running SMI handlers in >> constrained ring 3. Arm is gaining capability pointers, CHERI, that will >> enable the possibility for constrained, least-privileged kernel >> subsystems. Would it not be advantageous for Xen to start moving in such >> a direction that would enable it to provide a new level of safety and >> security for consumers of Xen? >> >> Coming back to the short-term, I would advocate for introducing the >> concept and abstraction of constrained context switching through a set >> of function calls, which would likely be under XSM to allow policy >> enforcement. Likely the introductory implementation would just mask the >> fact that it is just setting `is_privileged` for DOMID_IDLE. Future >> evolution of the capability could see the introduction of new >> "contexts", whether they are represented by a domain could be determined >> then, and the ability to do controlled switching based on policy. > > For the specific case of evtchn_alloc_unbound, Flask's > xsm_evtchn_unbound has a side effect of labeling the event channel. > So skipping the hook will have unintended consequences for Flask. > > xsm_evtchn_unbound could be split in two to have an access piece and a > labeling piece. The access piece is run at hypercall entry, and the > labeling is still done in evtchn_alloc_unbound. For Flask, labeling > depends on the two domain endpoints, but not current. > > More generally, it seems to me there are too many xsm checks in the > middle of functions/operations. They are fine for a normal entry via > hypercall, but they interfere with Xen's internal operations. Xen > shouldn't be restricted in its own operations. The live update people > hit it with domain creation, and I just posted a patch for > unmap_domain_pirq. > > It would be more obvious for auditing if each hypercall entrypoint > applied xsm checks. Make the allow/deny decision as early as > possible. Then a worker function would be easily callable for the > Xen-internal case. The flip side of that is the xsm hook may need > sub-op specific data to make its decision, so it fits to put it in the > sub-op function. It seems to me the location of hooks was determined > by where the data they need is already available. Re-arranging hooks > may require some duplication. While it can be inconvenient in some cases, one of the purposes of flask is to provide fine(er) grained access control over resources and for Xen the path to a resource is often through a multiplexed interface. As such I would be very cautious in considering the addition of distance between the time of check and the time of access to a resource. It has been some time since it has occurred but I am aware that there was an audit conducted to ensure all necessary resources were covered and the placement of decision points were in the correct locations. With that said, the code has evolved and it may be time for another audit (though it would take some cycles to do such an audit). > The xsm controls should clearly apply to the DomUs and other entities > managed by Xen. xsm restricting Xen itself seems wrong. Having > internal operations get denied by xsm may unintentionally subvert Xen > itself. > > Yes, moving checks outward makes for an un-restricted middle. But the > security server running in the same address space isn't going to help > against code exec inside the hypervisor. > > I think I view it like this: Why is a xen internal operation subject > to a security check? When would one ever want to deny a xen internal > operation? In the past this was the standard of thinking but the reality of the last few years has demonstrated this approach can no longer be relied upon. If this approach was safe, then Intel would have had no reason to put the work into providing the necessary mechanisms to de-privilege SMI handlers, just to pick a recent example. Hardware changes are coming, and in the case of OpenPOWER and Arm it is already here, that will enable kernels and hypervisors to de-privilege sections of their processing logic depending on the risk, for example ensuring code processing input from a hypercall cannot reach memory used by the security server. My point here is that instead of cementing a historical approach into the code base, that perhaps it would be worth considering using a more flexible approach that provides the necessary modularity to enable the introduction of this kind of capability, even if today it is merely a facade, without having to rework the interfaces to remove all the secondary direct paths. V/r, dps
On 25.03.2022 01:30, Stefano Stabellini wrote: > Maybe, instead of exporting get_free_port, we could create a new > function in xen/common/event_channel.c and mark it as __init? This way: > - we don't need to expose get_free_port > - the new function would only be __init anyway, so no risk of runtime > misuse > > What do you think? Maybe. Such a function would want to serve both your an Daniel's purpose then. Jan
© 2016 - 2024 Red Hat, Inc.