[PATCH v3 2/5] xen: make evtchn_alloc_unbound public

Stefano Stabellini posted 5 patches 2 years, 10 months ago
There is a newer version of this series
[PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Stefano Stabellini 2 years, 10 months ago
From: Luca Miccio <lucmiccio@gmail.com>

The xenstore event channel will be allocated for dom0less domains. It is
necessary to have access to the evtchn_alloc_unbound function to do
that, so make evtchn_alloc_unbound public.

Add a skip_xsm parameter to allow disabling the XSM check in
evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
originated from Xen before running any domains.)

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Wei Liu <wl@xen.org>
---
Changes v3:
- expose evtchn_alloc_unbound, assing a skip_xsm parameter
---
 xen/common/event_channel.c | 13 ++++++++-----
 xen/include/xen/event.h    |  3 +++
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index da88ad141a..be57d00a15 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
     xsm_evtchn_close_post(chn);
 }
 
-static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
+int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
 {
     struct evtchn *chn;
     struct domain *d;
@@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
         ERROR_EXIT_DOM(port, d);
     chn = evtchn_from_port(d, port);
 
-    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
-    if ( rc )
-        goto out;
+    if ( !skip_xsm )
+    {
+        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
+        if ( rc )
+            goto out;
+    }
 
     evtchn_write_lock(chn);
 
@@ -1195,7 +1198,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         struct evtchn_alloc_unbound alloc_unbound;
         if ( copy_from_guest(&alloc_unbound, arg, 1) != 0 )
             return -EFAULT;
-        rc = evtchn_alloc_unbound(&alloc_unbound);
+        rc = evtchn_alloc_unbound(&alloc_unbound, false);
         if ( !rc && __copy_to_guest(arg, &alloc_unbound, 1) )
             rc = -EFAULT; /* Cleaning up here would be a mess! */
         break;
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 21c95e14fd..0a2cdedf7d 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -68,6 +68,9 @@ int evtchn_close(struct domain *d1, int port1, bool guest);
 /* Free an event channel. */
 void evtchn_free(struct domain *d, struct evtchn *chn);
 
+/* Create a new event channel port */
+int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm);
+
 /* Allocate a specific event channel port. */
 int evtchn_allocate_port(struct domain *d, unsigned int port);
 
-- 
2.25.1


Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Jan Beulich 2 years, 10 months ago
On 28.01.2022 22:33, Stefano Stabellini wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> The xenstore event channel will be allocated for dom0less domains. It is
> necessary to have access to the evtchn_alloc_unbound function to do
> that, so make evtchn_alloc_unbound public.
> 
> Add a skip_xsm parameter to allow disabling the XSM check in
> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
> originated from Xen before running any domains.)

What I continue to be missing here are some words on why it is okay to
skip XSM checking in this case. After all an alternative would be to
put in place a suitable domain as "current" one for the check to actually
work.

Jan


Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Daniel P. Smith 2 years, 8 months ago
On 1/28/22 16:33, Stefano Stabellini wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> The xenstore event channel will be allocated for dom0less domains. It is
> necessary to have access to the evtchn_alloc_unbound function to do
> that, so make evtchn_alloc_unbound public.
> 
> Add a skip_xsm parameter to allow disabling the XSM check in
> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
> originated from Xen before running any domains.)
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> CC: Julien Grall <julien@xen.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: George Dunlap <george.dunlap@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Wei Liu <wl@xen.org>
> ---
> Changes v3:
> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
> ---
>  xen/common/event_channel.c | 13 ++++++++-----
>  xen/include/xen/event.h    |  3 +++
>  2 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> index da88ad141a..be57d00a15 100644
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>      xsm_evtchn_close_post(chn);
>  }
>  
> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>  {
>      struct evtchn *chn;
>      struct domain *d;
> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>          ERROR_EXIT_DOM(port, d);
>      chn = evtchn_from_port(d, port);
>  
> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> -    if ( rc )
> -        goto out;
> +    if ( !skip_xsm )
> +    {
> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> +        if ( rc )
> +            goto out;
> +    }

Please do not subvert the security framework because it causes an
inconvenience. As Jan recommended, work within the XSM check to allow
your access so that we may ensure it is done safely. If you need any
help making modifications to XSM, please do not hesitate to reach out as
I will gladly help.

>      evtchn_write_lock(chn);
>  
> @@ -1195,7 +1198,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          struct evtchn_alloc_unbound alloc_unbound;
>          if ( copy_from_guest(&alloc_unbound, arg, 1) != 0 )
>              return -EFAULT;
> -        rc = evtchn_alloc_unbound(&alloc_unbound);
> +        rc = evtchn_alloc_unbound(&alloc_unbound, false);
>          if ( !rc && __copy_to_guest(arg, &alloc_unbound, 1) )
>              rc = -EFAULT; /* Cleaning up here would be a mess! */
>          break;
> diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
> index 21c95e14fd..0a2cdedf7d 100644
> --- a/xen/include/xen/event.h
> +++ b/xen/include/xen/event.h
> @@ -68,6 +68,9 @@ int evtchn_close(struct domain *d1, int port1, bool guest);
>  /* Free an event channel. */
>  void evtchn_free(struct domain *d, struct evtchn *chn);
>  
> +/* Create a new event channel port */
> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm);
> +
>  /* Allocate a specific event channel port. */
>  int evtchn_allocate_port(struct domain *d, unsigned int port);
>
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Stefano Stabellini 2 years, 8 months ago
On Tue, 15 Mar 2022, Daniel P. Smith wrote:
> On 1/28/22 16:33, Stefano Stabellini wrote:
> > From: Luca Miccio <lucmiccio@gmail.com>
> > 
> > The xenstore event channel will be allocated for dom0less domains. It is
> > necessary to have access to the evtchn_alloc_unbound function to do
> > that, so make evtchn_alloc_unbound public.
> > 
> > Add a skip_xsm parameter to allow disabling the XSM check in
> > evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
> > originated from Xen before running any domains.)
> > 
> > Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > CC: Julien Grall <julien@xen.org>
> > CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> > CC: Bertrand Marquis <bertrand.marquis@arm.com>
> > CC: Andrew Cooper <andrew.cooper3@citrix.com>
> > CC: George Dunlap <george.dunlap@citrix.com>
> > CC: Jan Beulich <jbeulich@suse.com>
> > CC: Wei Liu <wl@xen.org>
> > ---
> > Changes v3:
> > - expose evtchn_alloc_unbound, assing a skip_xsm parameter
> > ---
> >  xen/common/event_channel.c | 13 ++++++++-----
> >  xen/include/xen/event.h    |  3 +++
> >  2 files changed, 11 insertions(+), 5 deletions(-)
> > 
> > diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> > index da88ad141a..be57d00a15 100644
> > --- a/xen/common/event_channel.c
> > +++ b/xen/common/event_channel.c
> > @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
> >      xsm_evtchn_close_post(chn);
> >  }
> >  
> > -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> > +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> >  {
> >      struct evtchn *chn;
> >      struct domain *d;
> > @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> >          ERROR_EXIT_DOM(port, d);
> >      chn = evtchn_from_port(d, port);
> >  
> > -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> > -    if ( rc )
> > -        goto out;
> > +    if ( !skip_xsm )
> > +    {
> > +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> > +        if ( rc )
> > +            goto out;
> > +    }
> 
> Please do not subvert the security framework because it causes an
> inconvenience. As Jan recommended, work within the XSM check to allow
> your access so that we may ensure it is done safely. If you need any
> help making modifications to XSM, please do not hesitate to reach out as
> I will gladly help.

Thank you!

First let me reply to Jan: this series is only introducing 1 more call
to evtchn_alloc_unbound, which is to allocate the special xenstore event
channel, the one configured via
d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].

It is not meant to be a generic function, and it is not meant to be
called more than once. It could (should?) be __init.

The existing XSM check in evtchn_alloc_unbound cannot work and should
not work: it is based on the current domain having enough privileges to
create the event channel. In this case, we have no current domain at
all. The current domain is Xen itself.

For these reasons, given [1], also not to subvert the security
framework as Daniel pointed out, I think I should go back to my own
implementation [2][3] based on get_free_port. That is my preference
because:

- the Xen codebase doesn't gain much by reusing evtchn_alloc_unbound
- adding skip_xsm introduces a component of risk (unless we make it
  __init maybe?)
- using get_free_port is trivial and doesn't pose the same issues


Let's find all an agreement on how to move forward on this.


[1] https://marc.info/?l=xen-devel&m=164194128922838
[2] https://marc.info/?l=xen-devel&m=164203543615114
[3] https://marc.info/?l=xen-devel&m=164203544615129
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Daniel P. Smith 2 years, 8 months ago
On 3/22/22 20:22, Stefano Stabellini wrote:
> On Tue, 15 Mar 2022, Daniel P. Smith wrote:
>> On 1/28/22 16:33, Stefano Stabellini wrote:
>>> From: Luca Miccio <lucmiccio@gmail.com>
>>>
>>> The xenstore event channel will be allocated for dom0less domains. It is
>>> necessary to have access to the evtchn_alloc_unbound function to do
>>> that, so make evtchn_alloc_unbound public.
>>>
>>> Add a skip_xsm parameter to allow disabling the XSM check in
>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
>>> originated from Xen before running any domains.)
>>>
>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>> CC: Julien Grall <julien@xen.org>
>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>> CC: George Dunlap <george.dunlap@citrix.com>
>>> CC: Jan Beulich <jbeulich@suse.com>
>>> CC: Wei Liu <wl@xen.org>
>>> ---
>>> Changes v3:
>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
>>> ---
>>>  xen/common/event_channel.c | 13 ++++++++-----
>>>  xen/include/xen/event.h    |  3 +++
>>>  2 files changed, 11 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
>>> index da88ad141a..be57d00a15 100644
>>> --- a/xen/common/event_channel.c
>>> +++ b/xen/common/event_channel.c
>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>>>      xsm_evtchn_close_post(chn);
>>>  }
>>>  
>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>>>  {
>>>      struct evtchn *chn;
>>>      struct domain *d;
>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>          ERROR_EXIT_DOM(port, d);
>>>      chn = evtchn_from_port(d, port);
>>>  
>>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>> -    if ( rc )
>>> -        goto out;
>>> +    if ( !skip_xsm )
>>> +    {
>>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>> +        if ( rc )
>>> +            goto out;
>>> +    }
>>
>> Please do not subvert the security framework because it causes an
>> inconvenience. As Jan recommended, work within the XSM check to allow
>> your access so that we may ensure it is done safely. If you need any
>> help making modifications to XSM, please do not hesitate to reach out as
>> I will gladly help.
> 
> Thank you!
> 
> First let me reply to Jan: this series is only introducing 1 more call
> to evtchn_alloc_unbound, which is to allocate the special xenstore event
> channel, the one configured via
> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
> 
> It is not meant to be a generic function, and it is not meant to be
> called more than once. It could (should?) be __init.
> 
> The existing XSM check in evtchn_alloc_unbound cannot work and should
> not work: it is based on the current domain having enough privileges to
> create the event channel. In this case, we have no current domain at
> all. The current domain is Xen itself.

I have already replicated this in hyperlaunch for PV construction where
I have constructed the event channel for both xenstore and the console.
For hyperlaunch the construction is under a single, fairly-tight
function where I have promoted the Idle Domain to is_privileged before
the creation/construction loop starts and then demote the Idle Domain on
the other side of the loop. Honestly this is not my preferred approach
but for the initial implementation I do have a moderate amount of
confidence regarding the risk that results. My current thinking is that
the more appropriate approach would be to introduce a new system domain,
Construct Domain??, to provide a separate context under which all the
hyperlaunch creation and construction logic would be done and then
destroyed as part of init finalization.

> For these reasons, given [1], also not to subvert the security
> framework as Daniel pointed out, I think I should go back to my own
> implementation [2][3] based on get_free_port. That is my preference
> because:
> 
> - the Xen codebase doesn't gain much by reusing evtchn_alloc_unbound
> - adding skip_xsm introduces a component of risk (unless we make it
>   __init maybe?)
> - using get_free_port is trivial and doesn't pose the same issues
> 
> 
> Let's find all an agreement on how to move forward on this.
> 
> 
> [1] https://marc.info/?l=xen-devel&m=164194128922838
> [2] https://marc.info/?l=xen-devel&m=164203543615114
> [3] https://marc.info/?l=xen-devel&m=164203544615129
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Jan Beulich 2 years, 8 months ago
On 23.03.2022 01:22, Stefano Stabellini wrote:
> On Tue, 15 Mar 2022, Daniel P. Smith wrote:
>> On 1/28/22 16:33, Stefano Stabellini wrote:
>>> From: Luca Miccio <lucmiccio@gmail.com>
>>>
>>> The xenstore event channel will be allocated for dom0less domains. It is
>>> necessary to have access to the evtchn_alloc_unbound function to do
>>> that, so make evtchn_alloc_unbound public.
>>>
>>> Add a skip_xsm parameter to allow disabling the XSM check in
>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
>>> originated from Xen before running any domains.)
>>>
>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>> CC: Julien Grall <julien@xen.org>
>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>> CC: George Dunlap <george.dunlap@citrix.com>
>>> CC: Jan Beulich <jbeulich@suse.com>
>>> CC: Wei Liu <wl@xen.org>
>>> ---
>>> Changes v3:
>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
>>> ---
>>>  xen/common/event_channel.c | 13 ++++++++-----
>>>  xen/include/xen/event.h    |  3 +++
>>>  2 files changed, 11 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
>>> index da88ad141a..be57d00a15 100644
>>> --- a/xen/common/event_channel.c
>>> +++ b/xen/common/event_channel.c
>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>>>      xsm_evtchn_close_post(chn);
>>>  }
>>>  
>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>>>  {
>>>      struct evtchn *chn;
>>>      struct domain *d;
>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>          ERROR_EXIT_DOM(port, d);
>>>      chn = evtchn_from_port(d, port);
>>>  
>>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>> -    if ( rc )
>>> -        goto out;
>>> +    if ( !skip_xsm )
>>> +    {
>>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>> +        if ( rc )
>>> +            goto out;
>>> +    }
>>
>> Please do not subvert the security framework because it causes an
>> inconvenience. As Jan recommended, work within the XSM check to allow
>> your access so that we may ensure it is done safely. If you need any
>> help making modifications to XSM, please do not hesitate to reach out as
>> I will gladly help.
> 
> Thank you!
> 
> First let me reply to Jan: this series is only introducing 1 more call
> to evtchn_alloc_unbound, which is to allocate the special xenstore event
> channel, the one configured via
> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
> 
> It is not meant to be a generic function, and it is not meant to be
> called more than once. It could (should?) be __init.

How that? Its pre-existing use doesn't disappear, and requires it to be
non-__init.

> The existing XSM check in evtchn_alloc_unbound cannot work and should
> not work: it is based on the current domain having enough privileges to
> create the event channel. In this case, we have no current domain at
> all. The current domain is Xen itself.

And DOM_XEN cannot be given the appropriate permission, perhaps
explicitly when using a real policy and by default in dummy and SILO
modes?

Jan

> For these reasons, given [1], also not to subvert the security
> framework as Daniel pointed out, I think I should go back to my own
> implementation [2][3] based on get_free_port. That is my preference
> because:
> 
> - the Xen codebase doesn't gain much by reusing evtchn_alloc_unbound
> - adding skip_xsm introduces a component of risk (unless we make it
>   __init maybe?)
> - using get_free_port is trivial and doesn't pose the same issues
> 
> 
> Let's find all an agreement on how to move forward on this.
> 
> 
> [1] https://marc.info/?l=xen-devel&m=164194128922838
> [2] https://marc.info/?l=xen-devel&m=164203543615114
> [3] https://marc.info/?l=xen-devel&m=164203544615129 
>
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Stefano Stabellini 2 years, 8 months ago
On Wed, 23 Mar 2022, Jan Beulich wrote:
> On 23.03.2022 01:22, Stefano Stabellini wrote:
> > On Tue, 15 Mar 2022, Daniel P. Smith wrote:
> >> On 1/28/22 16:33, Stefano Stabellini wrote:
> >>> From: Luca Miccio <lucmiccio@gmail.com>
> >>>
> >>> The xenstore event channel will be allocated for dom0less domains. It is
> >>> necessary to have access to the evtchn_alloc_unbound function to do
> >>> that, so make evtchn_alloc_unbound public.
> >>>
> >>> Add a skip_xsm parameter to allow disabling the XSM check in
> >>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
> >>> originated from Xen before running any domains.)
> >>>
> >>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> >>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> >>> CC: Julien Grall <julien@xen.org>
> >>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> >>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> >>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >>> CC: George Dunlap <george.dunlap@citrix.com>
> >>> CC: Jan Beulich <jbeulich@suse.com>
> >>> CC: Wei Liu <wl@xen.org>
> >>> ---
> >>> Changes v3:
> >>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
> >>> ---
> >>>  xen/common/event_channel.c | 13 ++++++++-----
> >>>  xen/include/xen/event.h    |  3 +++
> >>>  2 files changed, 11 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> >>> index da88ad141a..be57d00a15 100644
> >>> --- a/xen/common/event_channel.c
> >>> +++ b/xen/common/event_channel.c
> >>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
> >>>      xsm_evtchn_close_post(chn);
> >>>  }
> >>>  
> >>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> >>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> >>>  {
> >>>      struct evtchn *chn;
> >>>      struct domain *d;
> >>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> >>>          ERROR_EXIT_DOM(port, d);
> >>>      chn = evtchn_from_port(d, port);
> >>>  
> >>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> >>> -    if ( rc )
> >>> -        goto out;
> >>> +    if ( !skip_xsm )
> >>> +    {
> >>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> >>> +        if ( rc )
> >>> +            goto out;
> >>> +    }
> >>
> >> Please do not subvert the security framework because it causes an
> >> inconvenience. As Jan recommended, work within the XSM check to allow
> >> your access so that we may ensure it is done safely. If you need any
> >> help making modifications to XSM, please do not hesitate to reach out as
> >> I will gladly help.
> > 
> > Thank you!
> > 
> > First let me reply to Jan: this series is only introducing 1 more call
> > to evtchn_alloc_unbound, which is to allocate the special xenstore event
> > channel, the one configured via
> > d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
> > 
> > It is not meant to be a generic function, and it is not meant to be
> > called more than once. It could (should?) be __init.
> 
> How that? Its pre-existing use doesn't disappear, and requires it to be
> non-__init.

Sorry I meant the new function (calling get_free_port) for the new
use-case could be __init. The new function could be added to
xen/common/event_channel.c or to xen/arch/arm/domain_build.c.


> > The existing XSM check in evtchn_alloc_unbound cannot work and should
> > not work: it is based on the current domain having enough privileges to
> > create the event channel. In this case, we have no current domain at
> > all. The current domain is Xen itself.
> 
> And DOM_XEN cannot be given the appropriate permission, perhaps
> explicitly when using a real policy and by default in dummy and SILO
> modes?

The issue is that the check is based on "current", not one of the
domains passed as an argument to evtchn_alloc_unbound. Otherwise,
passing DOMID_XEN would be straightforward.

We would need to use a hack (like Daniel wrote in the other email) to
set the idle_domain temporarily as a privileged domain only for the sake
of passing an irrelevant (irrelevant as in "not relevant to this case")
XSM check. That cannot be an improvement. Also, setting current to a
"fake" domain is not great either.

In the specific case of dom0less and this patch, this is the only
instance of this issue and could be solved very straightforwardly by
calling get_free_port directly as we discussed [1].

I know Julien had some reservations about that. Let's try to find a
technical solution that makes everyone happy.

Maybe, instead of exporting get_free_port, we could create a new
function in xen/common/event_channel.c and mark it as __init? This way:
- we don't need to expose get_free_port
- the new function would only be __init anyway, so no risk of runtime
  misuse

What do you think?

[1] https://marc.info/?l=xen-devel&m=164197327305903
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Julien Grall 2 years, 8 months ago
Hi Stefano,

On 25/03/2022 00:30, Stefano Stabellini wrote:
> On Wed, 23 Mar 2022, Jan Beulich wrote:
>> On 23.03.2022 01:22, Stefano Stabellini wrote:
>>> On Tue, 15 Mar 2022, Daniel P. Smith wrote:
>>>> On 1/28/22 16:33, Stefano Stabellini wrote:
>>>>> From: Luca Miccio <lucmiccio@gmail.com>
>>>>>
>>>>> The xenstore event channel will be allocated for dom0less domains. It is
>>>>> necessary to have access to the evtchn_alloc_unbound function to do
>>>>> that, so make evtchn_alloc_unbound public.
>>>>>
>>>>> Add a skip_xsm parameter to allow disabling the XSM check in
>>>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
>>>>> originated from Xen before running any domains.)
>>>>>
>>>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>> CC: Julien Grall <julien@xen.org>
>>>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>>>>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
>>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> CC: George Dunlap <george.dunlap@citrix.com>
>>>>> CC: Jan Beulich <jbeulich@suse.com>
>>>>> CC: Wei Liu <wl@xen.org>
>>>>> ---
>>>>> Changes v3:
>>>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
>>>>> ---
>>>>>   xen/common/event_channel.c | 13 ++++++++-----
>>>>>   xen/include/xen/event.h    |  3 +++
>>>>>   2 files changed, 11 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
>>>>> index da88ad141a..be57d00a15 100644
>>>>> --- a/xen/common/event_channel.c
>>>>> +++ b/xen/common/event_channel.c
>>>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>>>>>       xsm_evtchn_close_post(chn);
>>>>>   }
>>>>>   
>>>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>>>>>   {
>>>>>       struct evtchn *chn;
>>>>>       struct domain *d;
>>>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>>>           ERROR_EXIT_DOM(port, d);
>>>>>       chn = evtchn_from_port(d, port);
>>>>>   
>>>>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>>>> -    if ( rc )
>>>>> -        goto out;
>>>>> +    if ( !skip_xsm )
>>>>> +    {
>>>>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>>>> +        if ( rc )
>>>>> +            goto out;
>>>>> +    }
>>>>
>>>> Please do not subvert the security framework because it causes an
>>>> inconvenience. As Jan recommended, work within the XSM check to allow
>>>> your access so that we may ensure it is done safely. If you need any
>>>> help making modifications to XSM, please do not hesitate to reach out as
>>>> I will gladly help.
>>>
>>> Thank you!
>>>
>>> First let me reply to Jan: this series is only introducing 1 more call
>>> to evtchn_alloc_unbound, which is to allocate the special xenstore event
>>> channel, the one configured via
>>> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
>>>
>>> It is not meant to be a generic function, and it is not meant to be
>>> called more than once. It could (should?) be __init.
>>
>> How that? Its pre-existing use doesn't disappear, and requires it to be
>> non-__init.
> 
> Sorry I meant the new function (calling get_free_port) for the new
> use-case could be __init. The new function could be added to
> xen/common/event_channel.c or to xen/arch/arm/domain_build.c.
> 
> 
>>> The existing XSM check in evtchn_alloc_unbound cannot work and should
>>> not work: it is based on the current domain having enough privileges to
>>> create the event channel. In this case, we have no current domain at
>>> all. The current domain is Xen itself.
>>
>> And DOM_XEN cannot be given the appropriate permission, perhaps
>> explicitly when using a real policy and by default in dummy and SILO
>> modes?
> 
> The issue is that the check is based on "current", not one of the
> domains passed as an argument to evtchn_alloc_unbound. Otherwise,
> passing DOMID_XEN would be straightforward.
> 
> We would need to use a hack (like Daniel wrote in the other email) to
> set the idle_domain temporarily as a privileged domain only for the sake
> of passing an irrelevant (irrelevant as in "not relevant to this case")
> XSM check. That cannot be an improvement. Also, setting current to a
> "fake" domain is not great either.
I agree they are not great. But this needs to be compared with the other 
proposals.

AFAIU, your proposal is to duplicate code. This brings other risks such 
as duplicated bug and more code to maintain.

In your case, you only need one helper. But in some other context (e.g. 
Live-Update and it looks like Hyperlaunch), we may need to re-use more 
hypercalls code.

So to me, the idea of switching to a "fake" domain or bypassing the 
check is more appealing. I have a preference for the "fake" domain here.

> 
> In the specific case of dom0less and this patch, this is the only
> instance of this issue and could be solved very straightforwardly by
> calling get_free_port directly as we discussed [1].

Exporting get_free_port() is a no-go for me. This can be easily mis-used 
and in fact you already did it in your proposal by not using the proper 
locking (I appreciate this is meant to be boot code only).

> 
> I know Julien had some reservations about that. Let's try to find a
> technical solution that makes everyone happy.
> 
> Maybe, instead of exporting get_free_port, we could create a new
> function in xen/common/event_channel.c and mark it as __init? This way:
> - we don't need to expose get_free_port
> - the new function would only be __init anyway, so no risk of runtime
>    misuse

I think the code duplication is short-sighted. I could possibly accept 
one instance, but I suspect the proposal [1] will end up to add more.
So IMHO we should try to resolve it generically now.

Cheers,

[1] 
<4836304496e6fbbea41348ed8cc9fcf6b0f3e893.1648049827.git.rahul.singh@arm.com>

-- 
Julien Grall
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Stefano Stabellini 2 years, 8 months ago
On Fri, 25 Mar 2022, Julien Grall wrote:
> So to me, the idea of switching to a "fake" domain or bypassing the check is
> more appealing. I have a preference for the "fake" domain here.

As a maintainer, I am not opposed to the "fake"/"contructor" domain
concept.  It all depends on how many instances of this issue we are
going to have.  This is the only one on xen-devel so far. I don't think
it is worth adding a constructor domain for one instance only.  But I
agree with you and Daniel that if we end up with several instances, then
the constructor domain approach is probably the best option overall.


As a contributor, sadly I won't be able to spend a lot of time on this
in the following months. If a significant rework is required, I don't
think I'll be able to do it, at least not for this Xen release (and it
would be nice to have dom0less PV drivers in the coming release.) If
Daniel is willing, I could add his "idle_domain is_priv" patch to this
series.  Not as clean as a proper constructor domain but it would work
and it would be simple. It could be evolved into a nicer constructor
domain later.

This is not my preference but I could do that if Julien and Jan prefer
this approach and if Daniel is happy to share his patch.


> AFAIU, your proposal is to duplicate code. This brings other risks such as
> duplicated bug and more code to maintain.

Got it. I'll make one last attempt at a proposal that doesn't involve
the fake constructor domain. The goal is to avoid code duplication while
providing a safe way forward to make progress with only a small amount
of changes. What if we:

- rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
- add a skip_xsm parameter to _evtchn_alloc_unbound
- introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
  false (same interface as today's evtchn_alloc_unbound)
- introduce an __init early_evtchn_alloc_unbound public function that
  sets skip_xsm to true

This way:
- we have a single implementation in _evtchn_alloc_unbound (no code
  duplication)
- the only function exposed that skips the XSM check is __init
- evtchn_alloc_unbound continue to have the XSM check same as today


E.g.:
static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
{
    ...
}

static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
{
    return _evtchn_alloc_unbound(alloc, false);    
}

int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
{
    return _evtchn_alloc_unbound(alloc, true);
}


Would this be good enough for now?
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Daniel P. Smith 2 years, 8 months ago
On 3/25/22 17:05, Stefano Stabellini wrote:
> On Fri, 25 Mar 2022, Julien Grall wrote:
>> So to me, the idea of switching to a "fake" domain or bypassing the check is
>> more appealing. I have a preference for the "fake" domain here.
> 
> As a maintainer, I am not opposed to the "fake"/"contructor" domain
> concept.  It all depends on how many instances of this issue we are
> going to have.  This is the only one on xen-devel so far. I don't think
> it is worth adding a constructor domain for one instance only.  But I
> agree with you and Daniel that if we end up with several instances, then
> the constructor domain approach is probably the best option overall.
> 
> 
> As a contributor, sadly I won't be able to spend a lot of time on this
> in the following months. If a significant rework is required, I don't
> think I'll be able to do it, at least not for this Xen release (and it
> would be nice to have dom0less PV drivers in the coming release.) If
> Daniel is willing, I could add his "idle_domain is_priv" patch to this
> series.  Not as clean as a proper constructor domain but it would work
> and it would be simple. It could be evolved into a nicer constructor
> domain later.
> 
> This is not my preference but I could do that if Julien and Jan prefer
> this approach and if Daniel is happy to share his patch.
> 
> 
>> AFAIU, your proposal is to duplicate code. This brings other risks such as
>> duplicated bug and more code to maintain.
> 
> Got it. I'll make one last attempt at a proposal that doesn't involve
> the fake constructor domain. The goal is to avoid code duplication while
> providing a safe way forward to make progress with only a small amount
> of changes. What if we:
> 
> - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
> - add a skip_xsm parameter to _evtchn_alloc_unbound
> - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
>   false (same interface as today's evtchn_alloc_unbound)
> - introduce an __init early_evtchn_alloc_unbound public function that
>   sets skip_xsm to true
> 
> This way:
> - we have a single implementation in _evtchn_alloc_unbound (no code
>   duplication)
> - the only function exposed that skips the XSM check is __init
> - evtchn_alloc_unbound continue to have the XSM check same as today
> 
> 
> E.g.:
> static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> {
>     ...
> }
> 
> static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>     return _evtchn_alloc_unbound(alloc, false);    
> }
> 
> int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>     return _evtchn_alloc_unbound(alloc, true);
> }
> 
> 
> Would this be good enough for now?

Please see the RFC patch I just posted[1], IMHO I think this is a safer
approach for this specific instance.

[1]
https://lore.kernel.org/xen-devel/20220328203622.30961-1-dpsmith@apertussolutions.com/T/#t

v/r,
dps
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Stefano Stabellini 2 years, 8 months ago
On Mon, 28 Mar 2022, Daniel P. Smith wrote:
> On 3/25/22 17:05, Stefano Stabellini wrote:
> > On Fri, 25 Mar 2022, Julien Grall wrote:
> >> So to me, the idea of switching to a "fake" domain or bypassing the check is
> >> more appealing. I have a preference for the "fake" domain here.
> > 
> > As a maintainer, I am not opposed to the "fake"/"contructor" domain
> > concept.  It all depends on how many instances of this issue we are
> > going to have.  This is the only one on xen-devel so far. I don't think
> > it is worth adding a constructor domain for one instance only.  But I
> > agree with you and Daniel that if we end up with several instances, then
> > the constructor domain approach is probably the best option overall.
> > 
> > 
> > As a contributor, sadly I won't be able to spend a lot of time on this
> > in the following months. If a significant rework is required, I don't
> > think I'll be able to do it, at least not for this Xen release (and it
> > would be nice to have dom0less PV drivers in the coming release.) If
> > Daniel is willing, I could add his "idle_domain is_priv" patch to this
> > series.  Not as clean as a proper constructor domain but it would work
> > and it would be simple. It could be evolved into a nicer constructor
> > domain later.
> > 
> > This is not my preference but I could do that if Julien and Jan prefer
> > this approach and if Daniel is happy to share his patch.
> > 
> > 
> >> AFAIU, your proposal is to duplicate code. This brings other risks such as
> >> duplicated bug and more code to maintain.
> > 
> > Got it. I'll make one last attempt at a proposal that doesn't involve
> > the fake constructor domain. The goal is to avoid code duplication while
> > providing a safe way forward to make progress with only a small amount
> > of changes. What if we:
> > 
> > - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
> > - add a skip_xsm parameter to _evtchn_alloc_unbound
> > - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
> >   false (same interface as today's evtchn_alloc_unbound)
> > - introduce an __init early_evtchn_alloc_unbound public function that
> >   sets skip_xsm to true
> > 
> > This way:
> > - we have a single implementation in _evtchn_alloc_unbound (no code
> >   duplication)
> > - the only function exposed that skips the XSM check is __init
> > - evtchn_alloc_unbound continue to have the XSM check same as today
> > 
> > 
> > E.g.:
> > static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> > {
> >     ...
> > }
> > 
> > static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> > {
> >     return _evtchn_alloc_unbound(alloc, false);    
> > }
> > 
> > int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> > {
> >     return _evtchn_alloc_unbound(alloc, true);
> > }
> > 
> > 
> > Would this be good enough for now?
> 
> Please see the RFC patch I just posted[1], IMHO I think this is a safer
> approach for this specific instance.
> 
> [1]
> https://lore.kernel.org/xen-devel/20220328203622.30961-1-dpsmith@apertussolutions.com/T/#t

I read it, the patch looks fine. I also tested it together with my
series and it solves the problem. With [1], it is just a matter of
making evtchn_alloc_unbound as is non-static.

If the other maintainers also agree with [1], then I'll just rebase on
it and limit my changes to exporting evtchn_alloc_unbound.
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Daniel P. Smith 2 years, 8 months ago
On 3/25/22 17:05, Stefano Stabellini wrote:
> On Fri, 25 Mar 2022, Julien Grall wrote:
>> So to me, the idea of switching to a "fake" domain or bypassing the check is
>> more appealing. I have a preference for the "fake" domain here.
> 
> As a maintainer, I am not opposed to the "fake"/"contructor" domain
> concept.  It all depends on how many instances of this issue we are
> going to have.  This is the only one on xen-devel so far. I don't think
> it is worth adding a constructor domain for one instance only.  But I
> agree with you and Daniel that if we end up with several instances, then
> the constructor domain approach is probably the best option overall.

The constructor domain still needs more discussion and would likely be
part of a larger approach that will require buy-in from several
maintainers and should be looking to solve a more general problem
internal access control of which domain construction within the
hypervisor is just one case. For this I would be glad to start a working
group, for which the start of can add to the next community call agenda.

> As a contributor, sadly I won't be able to spend a lot of time on this
> in the following months. If a significant rework is required, I don't
> think I'll be able to do it, at least not for this Xen release (and it
> would be nice to have dom0less PV drivers in the coming release.) If
> Daniel is willing, I could add his "idle_domain is_priv" patch to this
> series.  Not as clean as a proper constructor domain but it would work
> and it would be simple. It could be evolved into a nicer constructor
> domain later.
> 
> This is not my preference but I could do that if Julien and Jan prefer
> this approach and if Daniel is happy to share his patch.

I can look to spin out a general version of what I am doing, likely
exposed as an XSM call so it can be handled appropriately across policies.

>> AFAIU, your proposal is to duplicate code. This brings other risks such as
>> duplicated bug and more code to maintain.
> 
> Got it. I'll make one last attempt at a proposal that doesn't involve
> the fake constructor domain. The goal is to avoid code duplication while
> providing a safe way forward to make progress with only a small amount
> of changes. What if we:
> 
> - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
> - add a skip_xsm parameter to _evtchn_alloc_unbound
> - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
>   false (same interface as today's evtchn_alloc_unbound)
> - introduce an __init early_evtchn_alloc_unbound public function that
>   sets skip_xsm to true
> 
> This way:
> - we have a single implementation in _evtchn_alloc_unbound (no code
>   duplication)
> - the only function exposed that skips the XSM check is __init
> - evtchn_alloc_unbound continue to have the XSM check same as today
> 
> 
> E.g.:
> static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> {
>     ...
> }
> 
> static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>     return _evtchn_alloc_unbound(alloc, false);    
> }
> 
> int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>     return _evtchn_alloc_unbound(alloc, true);
> }
> 
> 
> Would this be good enough for now?

I guest the question is if it is okay for this to exist until the new
XSM calls are found to be acceptable and then this is reverted/changed
to the XSM calls?

v/r
dps
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Julien Grall 2 years, 8 months ago
Hi Stefano,

On 25/03/2022 21:05, Stefano Stabellini wrote:
> On Fri, 25 Mar 2022, Julien Grall wrote: > As a contributor, sadly I won't be able to spend a lot of time on this
> in the following months. If a significant rework is required, I don't
> think I'll be able to do it, at least not for this Xen release (and it
> would be nice to have dom0less PV drivers in the coming release.) If
> Daniel is willing, I could add his "idle_domain is_priv" patch to this
> series.  Not as clean as a proper constructor domain but it would work
> and it would be simple. It could be evolved into a nicer constructor
> domain later.
> 
> This is not my preference but I could do that if Julien and Jan prefer
> this approach and if Daniel is happy to share his patch.

This is still my preference because we are avoiding to push the problem 
to the unlucky person that will need to introduce another (or multiple) 
'skip_xsm'.

>> AFAIU, your proposal is to duplicate code. This brings other risks such as
>> duplicated bug and more code to maintain.
> 
> Got it. I'll make one last attempt at a proposal that doesn't involve
> the fake constructor domain. The goal is to avoid code duplication while
> providing a safe way forward to make progress with only a small amount
> of changes. What if we:
> 
> - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
> - add a skip_xsm parameter to _evtchn_alloc_unbound
> - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
>    false (same interface as today's evtchn_alloc_unbound)
> - introduce an __init early_evtchn_alloc_unbound public function that
>    sets skip_xsm to true
> 
> This way:
> - we have a single implementation in _evtchn_alloc_unbound (no code
>    duplication)
> - the only function exposed that skips the XSM check is __init
> - evtchn_alloc_unbound continue to have the XSM check same as today
> 
> 
> E.g.:
> static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> {
>      ...
> }
> 
> static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>      return _evtchn_alloc_unbound(alloc, false);
> }
> 
> int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>      return _evtchn_alloc_unbound(alloc, true);
> }
> 
> 
> Would this be good enough for now?

I would be OK with that. Note that, I think we need to protect the use 
of skip_xsm with evaluate_nospec().

Cheers,

-- 
Julien Grall
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Daniel P. Smith 2 years, 8 months ago
On 3/24/22 20:30, Stefano Stabellini wrote:
> On Wed, 23 Mar 2022, Jan Beulich wrote:
>> On 23.03.2022 01:22, Stefano Stabellini wrote:
>>> On Tue, 15 Mar 2022, Daniel P. Smith wrote:
>>>> On 1/28/22 16:33, Stefano Stabellini wrote:
>>>>> From: Luca Miccio <lucmiccio@gmail.com>
>>>>>
>>>>> The xenstore event channel will be allocated for dom0less domains. It is
>>>>> necessary to have access to the evtchn_alloc_unbound function to do
>>>>> that, so make evtchn_alloc_unbound public.
>>>>>
>>>>> Add a skip_xsm parameter to allow disabling the XSM check in
>>>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
>>>>> originated from Xen before running any domains.)
>>>>>
>>>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>> CC: Julien Grall <julien@xen.org>
>>>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>>>>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
>>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> CC: George Dunlap <george.dunlap@citrix.com>
>>>>> CC: Jan Beulich <jbeulich@suse.com>
>>>>> CC: Wei Liu <wl@xen.org>
>>>>> ---
>>>>> Changes v3:
>>>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
>>>>> ---
>>>>>  xen/common/event_channel.c | 13 ++++++++-----
>>>>>  xen/include/xen/event.h    |  3 +++
>>>>>  2 files changed, 11 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
>>>>> index da88ad141a..be57d00a15 100644
>>>>> --- a/xen/common/event_channel.c
>>>>> +++ b/xen/common/event_channel.c
>>>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>>>>>      xsm_evtchn_close_post(chn);
>>>>>  }
>>>>>  
>>>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>>>>>  {
>>>>>      struct evtchn *chn;
>>>>>      struct domain *d;
>>>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>>>          ERROR_EXIT_DOM(port, d);
>>>>>      chn = evtchn_from_port(d, port);
>>>>>  
>>>>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>>>> -    if ( rc )
>>>>> -        goto out;
>>>>> +    if ( !skip_xsm )
>>>>> +    {
>>>>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>>>> +        if ( rc )
>>>>> +            goto out;
>>>>> +    }
>>>>
>>>> Please do not subvert the security framework because it causes an
>>>> inconvenience. As Jan recommended, work within the XSM check to allow
>>>> your access so that we may ensure it is done safely. If you need any
>>>> help making modifications to XSM, please do not hesitate to reach out as
>>>> I will gladly help.
>>>
>>> Thank you!
>>>
>>> First let me reply to Jan: this series is only introducing 1 more call
>>> to evtchn_alloc_unbound, which is to allocate the special xenstore event
>>> channel, the one configured via
>>> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
>>>
>>> It is not meant to be a generic function, and it is not meant to be
>>> called more than once. It could (should?) be __init.
>>
>> How that? Its pre-existing use doesn't disappear, and requires it to be
>> non-__init.
> 
> Sorry I meant the new function (calling get_free_port) for the new
> use-case could be __init. The new function could be added to
> xen/common/event_channel.c or to xen/arch/arm/domain_build.c.
> 
> 
>>> The existing XSM check in evtchn_alloc_unbound cannot work and should
>>> not work: it is based on the current domain having enough privileges to
>>> create the event channel. In this case, we have no current domain at
>>> all. The current domain is Xen itself.
>>
>> And DOM_XEN cannot be given the appropriate permission, perhaps
>> explicitly when using a real policy and by default in dummy and SILO
>> modes?
> 
> The issue is that the check is based on "current", not one of the
> domains passed as an argument to evtchn_alloc_unbound. Otherwise,
> passing DOMID_XEN would be straightforward.
> 
> We would need to use a hack (like Daniel wrote in the other email) to
> set the idle_domain temporarily as a privileged domain only for the sake
> of passing an irrelevant (irrelevant as in "not relevant to this case")
> XSM check. That cannot be an improvement. Also, setting current to a
> "fake" domain is not great either.

My suggestion was not to intended to be simply a hack but looking at the
larger issue instead of simply doing a targeted fix for this one
instnace. While I cannot give an example right off hand, the reality is,
at least for hyperlaunch, that we cannot say for certain there will not
be further resource allocations that is protected by the security
framework and will require preliminary handling by the construction
logic in the hypervisor. The low-complexity approach is to address each
one in a case-by-case fashion using direct calls that go around the
security framework. A more security conscience, and higher complexity,
approach would be to consider a least-privilege approach and look at
introducing the ability to do controlled switching of contexts, i.e.
moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is
granted only the necessary privileges to do the resource allocations in
which it is limited.

This is also not the first time this issue has come up, I don't recall
the exact thread but several months ago someone ran into the issue they
need to make a call to a resource function and was blocked by XSM
because DOMID_IDLE has no privileges. The reality is that the idea of
monolithic high-privileged entities is being dropped in favor of
least-privilege, and where possible hardware enforced, constraint. This
can be seen with Intel de-privileging SMM and running SMI handlers in
constrained ring 3. Arm is gaining capability pointers, CHERI, that will
enable the possibility for constrained, least-privileged kernel
subsystems. Would it not be advantageous for Xen to start moving in such
a direction that would enable it to provide a new level of safety and
security for consumers of Xen?

Coming back to the short-term, I would advocate for introducing the
concept and abstraction of constrained context switching through a set
of function calls, which would likely be under XSM to allow policy
enforcement. Likely the introductory implementation would just mask the
fact that it is just setting `is_privileged` for DOMID_IDLE. Future
evolution of the capability could see the introduction of new
"contexts", whether they are represented by a domain could be determined
then, and the ability to do controlled switching based on policy.

Just wanted to put more of my thinking out there before a path is taken.

> In the specific case of dom0less and this patch, this is the only
> instance of this issue and could be solved very straightforwardly by
> calling get_free_port directly as we discussed [1].
> 
> I know Julien had some reservations about that. Let's try to find a
> technical solution that makes everyone happy.
> 
> Maybe, instead of exporting get_free_port, we could create a new
> function in xen/common/event_channel.c and mark it as __init? This way:
> - we don't need to expose get_free_port
> - the new function would only be __init anyway, so no risk of runtime
>   misuse
> 
> What do you think?
> 
> [1] https://marc.info/?l=xen-devel&m=164197327305903
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Julien Grall 2 years, 8 months ago
Hi Daniel,

On 25/03/2022 15:45, Daniel P. Smith wrote:
>>>> The existing XSM check in evtchn_alloc_unbound cannot work and should
>>>> not work: it is based on the current domain having enough privileges to
>>>> create the event channel. In this case, we have no current domain at
>>>> all. The current domain is Xen itself.
>>>
>>> And DOM_XEN cannot be given the appropriate permission, perhaps
>>> explicitly when using a real policy and by default in dummy and SILO
>>> modes?
>>
>> The issue is that the check is based on "current", not one of the
>> domains passed as an argument to evtchn_alloc_unbound. Otherwise,
>> passing DOMID_XEN would be straightforward.
>>
>> We would need to use a hack (like Daniel wrote in the other email) to
>> set the idle_domain temporarily as a privileged domain only for the sake
>> of passing an irrelevant (irrelevant as in "not relevant to this case")
>> XSM check. That cannot be an improvement. Also, setting current to a
>> "fake" domain is not great either.
> 
> My suggestion was not to intended to be simply a hack but looking at the
> larger issue instead of simply doing a targeted fix for this one
> instnace. While I cannot give an example right off hand, the reality is,
> at least for hyperlaunch, that we cannot say for certain there will not
> be further resource allocations that is protected by the security
> framework and will require preliminary handling by the construction
> logic in the hypervisor. The low-complexity approach is to address each
> one in a case-by-case fashion using direct calls that go around the
> security framework. A more security conscience, and higher complexity,
> approach would be to consider a least-privilege approach and look at
> introducing the ability to do controlled switching of contexts, i.e.
> moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is
> granted only the necessary privileges to do the resource allocations in
> which it is limited.
> 
> This is also not the first time this issue has come up, I don't recall
> the exact thread but several months ago someone ran into the issue they
> need to make a call to a resource function and was blocked by XSM
> because DOMID_IDLE has no privileges.

This was in the context of Live-Updating Xen. We are trying to re-use as 
much as possible the code used by the hypercalls. Those functions may 
contain xsm check that would fail because current would be an idle vCPU.

For the full context:

https://lore.kernel.org/xen-devel/bfd645cf42ef7786183be15c222ad04beed362c0.camel@xen.org/

> The reality is that the idea of
> monolithic high-privileged entities is being dropped in favor of
> least-privilege, and where possible hardware enforced, constraint. This
> can be seen with Intel de-privileging SMM and running SMI handlers in
> constrained ring 3. Arm is gaining capability pointers, CHERI, that will
> enable the possibility for constrained, least-privileged kernel
> subsystems. Would it not be advantageous for Xen to start moving in such
> a direction that would enable it to provide a new level of safety and
> security for consumers of Xen?
> 
> Coming back to the short-term, I would advocate for introducing the
> concept and abstraction of constrained context switching through a set
> of function calls, which would likely be under XSM to allow policy
> enforcement. Likely the introductory implementation would just mask the
> fact that it is just setting `is_privileged` for DOMID_IDLE. Future
> evolution of the capability could see the introduction of new
> "contexts", whether they are represented by a domain could be determined
> then, and the ability to do controlled switching based on policy.

+1 with this idea.

Cheers,

-- 
Julien Grall
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Jason Andryuk 2 years, 8 months ago
On Fri, Mar 25, 2022 at 11:46 AM Daniel P. Smith <dpsmith.dev@gmail.com> wrote:
>
> On 3/24/22 20:30, Stefano Stabellini wrote:
> > On Wed, 23 Mar 2022, Jan Beulich wrote:
> >> On 23.03.2022 01:22, Stefano Stabellini wrote:
> >>> The existing XSM check in evtchn_alloc_unbound cannot work and should
> >>> not work: it is based on the current domain having enough privileges to
> >>> create the event channel. In this case, we have no current domain at
> >>> all. The current domain is Xen itself.
> >>
> >> And DOM_XEN cannot be given the appropriate permission, perhaps
> >> explicitly when using a real policy and by default in dummy and SILO
> >> modes?
> >
> > The issue is that the check is based on "current", not one of the
> > domains passed as an argument to evtchn_alloc_unbound. Otherwise,
> > passing DOMID_XEN would be straightforward.
> >
> > We would need to use a hack (like Daniel wrote in the other email) to
> > set the idle_domain temporarily as a privileged domain only for the sake
> > of passing an irrelevant (irrelevant as in "not relevant to this case")
> > XSM check. That cannot be an improvement. Also, setting current to a
> > "fake" domain is not great either.
>
> My suggestion was not to intended to be simply a hack but looking at the
> larger issue instead of simply doing a targeted fix for this one
> instnace. While I cannot give an example right off hand, the reality is,
> at least for hyperlaunch, that we cannot say for certain there will not
> be further resource allocations that is protected by the security
> framework and will require preliminary handling by the construction
> logic in the hypervisor. The low-complexity approach is to address each
> one in a case-by-case fashion using direct calls that go around the
> security framework. A more security conscience, and higher complexity,
> approach would be to consider a least-privilege approach and look at
> introducing the ability to do controlled switching of contexts, i.e.
> moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is
> granted only the necessary privileges to do the resource allocations in
> which it is limited.
>
> This is also not the first time this issue has come up, I don't recall
> the exact thread but several months ago someone ran into the issue they
> need to make a call to a resource function and was blocked by XSM
> because DOMID_IDLE has no privileges. The reality is that the idea of
> monolithic high-privileged entities is being dropped in favor of
> least-privilege, and where possible hardware enforced, constraint. This
> can be seen with Intel de-privileging SMM and running SMI handlers in
> constrained ring 3. Arm is gaining capability pointers, CHERI, that will
> enable the possibility for constrained, least-privileged kernel
> subsystems. Would it not be advantageous for Xen to start moving in such
> a direction that would enable it to provide a new level of safety and
> security for consumers of Xen?
>
> Coming back to the short-term, I would advocate for introducing the
> concept and abstraction of constrained context switching through a set
> of function calls, which would likely be under XSM to allow policy
> enforcement. Likely the introductory implementation would just mask the
> fact that it is just setting `is_privileged` for DOMID_IDLE. Future
> evolution of the capability could see the introduction of new
> "contexts", whether they are represented by a domain could be determined
> then, and the ability to do controlled switching based on policy.

For the specific case of evtchn_alloc_unbound, Flask's
xsm_evtchn_unbound has a side effect of labeling the event channel.
So skipping the hook will have unintended consequences for Flask.

xsm_evtchn_unbound could be split in two to have an access piece and a
labeling piece.  The access piece is run at hypercall entry, and the
labeling is still done in evtchn_alloc_unbound.  For Flask, labeling
depends on the two domain endpoints, but not current.

More generally, it seems to me there are too many xsm checks in the
middle of functions/operations.  They are fine for a normal entry via
hypercall, but they interfere with Xen's internal operations.  Xen
shouldn't be restricted in its own operations.  The live update people
hit it with domain creation, and I just posted a patch for
unmap_domain_pirq.

It would be more obvious for auditing if each hypercall entrypoint
applied xsm checks.  Make the allow/deny decision as early as
possible.  Then a worker function would be easily callable for the
Xen-internal case.  The flip side of that is the xsm hook may need
sub-op specific data to make its decision, so it fits to put it in the
sub-op function.  It seems to me the location of hooks was determined
by where the data they need is already available.  Re-arranging hooks
may require some duplication.

The xsm controls should clearly apply to the DomUs and other entities
managed by Xen.  xsm restricting Xen itself seems wrong.  Having
internal operations get denied by xsm may unintentionally subvert Xen
itself.

Yes, moving checks outward makes for an un-restricted middle.  But the
security server running in the same address space isn't going to help
against code exec inside the hypervisor.

I think I view it like this:  Why is a xen internal operation subject
to a security check?  When would one ever want to deny a xen internal
operation?

Regards,
Jason
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Daniel P. Smith 2 years, 8 months ago
On 3/25/22 12:52, Jason Andryuk wrote:
> On Fri, Mar 25, 2022 at 11:46 AM Daniel P. Smith <dpsmith.dev@gmail.com> wrote:
>>
>> On 3/24/22 20:30, Stefano Stabellini wrote:
>>> On Wed, 23 Mar 2022, Jan Beulich wrote:
>>>> On 23.03.2022 01:22, Stefano Stabellini wrote:
>>>>> The existing XSM check in evtchn_alloc_unbound cannot work and should
>>>>> not work: it is based on the current domain having enough privileges to
>>>>> create the event channel. In this case, we have no current domain at
>>>>> all. The current domain is Xen itself.
>>>>
>>>> And DOM_XEN cannot be given the appropriate permission, perhaps
>>>> explicitly when using a real policy and by default in dummy and SILO
>>>> modes?
>>>
>>> The issue is that the check is based on "current", not one of the
>>> domains passed as an argument to evtchn_alloc_unbound. Otherwise,
>>> passing DOMID_XEN would be straightforward.
>>>
>>> We would need to use a hack (like Daniel wrote in the other email) to
>>> set the idle_domain temporarily as a privileged domain only for the sake
>>> of passing an irrelevant (irrelevant as in "not relevant to this case")
>>> XSM check. That cannot be an improvement. Also, setting current to a
>>> "fake" domain is not great either.
>>
>> My suggestion was not to intended to be simply a hack but looking at the
>> larger issue instead of simply doing a targeted fix for this one
>> instnace. While I cannot give an example right off hand, the reality is,
>> at least for hyperlaunch, that we cannot say for certain there will not
>> be further resource allocations that is protected by the security
>> framework and will require preliminary handling by the construction
>> logic in the hypervisor. The low-complexity approach is to address each
>> one in a case-by-case fashion using direct calls that go around the
>> security framework. A more security conscience, and higher complexity,
>> approach would be to consider a least-privilege approach and look at
>> introducing the ability to do controlled switching of contexts, i.e.
>> moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is
>> granted only the necessary privileges to do the resource allocations in
>> which it is limited.
>>
>> This is also not the first time this issue has come up, I don't recall
>> the exact thread but several months ago someone ran into the issue they
>> need to make a call to a resource function and was blocked by XSM
>> because DOMID_IDLE has no privileges. The reality is that the idea of
>> monolithic high-privileged entities is being dropped in favor of
>> least-privilege, and where possible hardware enforced, constraint. This
>> can be seen with Intel de-privileging SMM and running SMI handlers in
>> constrained ring 3. Arm is gaining capability pointers, CHERI, that will
>> enable the possibility for constrained, least-privileged kernel
>> subsystems. Would it not be advantageous for Xen to start moving in such
>> a direction that would enable it to provide a new level of safety and
>> security for consumers of Xen?
>>
>> Coming back to the short-term, I would advocate for introducing the
>> concept and abstraction of constrained context switching through a set
>> of function calls, which would likely be under XSM to allow policy
>> enforcement. Likely the introductory implementation would just mask the
>> fact that it is just setting `is_privileged` for DOMID_IDLE. Future
>> evolution of the capability could see the introduction of new
>> "contexts", whether they are represented by a domain could be determined
>> then, and the ability to do controlled switching based on policy.
> 
> For the specific case of evtchn_alloc_unbound, Flask's
> xsm_evtchn_unbound has a side effect of labeling the event channel.
> So skipping the hook will have unintended consequences for Flask.
> 
> xsm_evtchn_unbound could be split in two to have an access piece and a
> labeling piece.  The access piece is run at hypercall entry, and the
> labeling is still done in evtchn_alloc_unbound.  For Flask, labeling
> depends on the two domain endpoints, but not current.
> 
> More generally, it seems to me there are too many xsm checks in the
> middle of functions/operations.  They are fine for a normal entry via
> hypercall, but they interfere with Xen's internal operations.  Xen
> shouldn't be restricted in its own operations.  The live update people
> hit it with domain creation, and I just posted a patch for
> unmap_domain_pirq.
> 
> It would be more obvious for auditing if each hypercall entrypoint
> applied xsm checks.  Make the allow/deny decision as early as
> possible.  Then a worker function would be easily callable for the
> Xen-internal case.  The flip side of that is the xsm hook may need
> sub-op specific data to make its decision, so it fits to put it in the
> sub-op function.  It seems to me the location of hooks was determined
> by where the data they need is already available.  Re-arranging hooks
> may require some duplication.

While it can be inconvenient in some cases, one of the purposes of flask
is to provide fine(er) grained access control over resources and for Xen
the path to a resource is often through a multiplexed interface. As such
I would be very cautious in considering the addition of distance between
the time of check and the time of access to a resource.

It has been some time since it has occurred but I am aware that there
was an audit conducted to ensure all necessary resources were covered
and the placement of decision points were in the correct locations. With
that said, the code has evolved and it may be time for another audit
(though it would take some cycles to do such an audit).

> The xsm controls should clearly apply to the DomUs and other entities
> managed by Xen.  xsm restricting Xen itself seems wrong.  Having
> internal operations get denied by xsm may unintentionally subvert Xen
> itself.
> 
> Yes, moving checks outward makes for an un-restricted middle.  But the
> security server running in the same address space isn't going to help
> against code exec inside the hypervisor.
> 
> I think I view it like this:  Why is a xen internal operation subject
> to a security check?  When would one ever want to deny a xen internal
> operation?

In the past this was the standard of thinking but the reality of the
last few years has demonstrated this approach can no longer be relied
upon. If this approach was safe, then Intel would have had no reason to
put the work into providing the necessary mechanisms to de-privilege SMI
handlers, just to pick a recent example. Hardware changes are coming,
and in the case of OpenPOWER and Arm it is already here, that will
enable kernels and hypervisors to de-privilege sections of their
processing logic depending on the risk, for example ensuring code
processing input from a hypercall cannot reach memory used by the
security server.

My point here is that instead of cementing a historical approach into
the code base, that perhaps it would be worth considering using a more
flexible approach that provides the necessary modularity to enable the
introduction of this kind of capability, even if today it is merely a
facade, without having to rework the interfaces to remove all the
secondary direct paths.

V/r,
dps
Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
Posted by Jan Beulich 2 years, 8 months ago
On 25.03.2022 01:30, Stefano Stabellini wrote:
> Maybe, instead of exporting get_free_port, we could create a new
> function in xen/common/event_channel.c and mark it as __init? This way:
> - we don't need to expose get_free_port
> - the new function would only be __init anyway, so no risk of runtime
>   misuse
> 
> What do you think?

Maybe. Such a function would want to serve both your an Daniel's purpose
then.

Jan