xen/common/domain.c | 27 +++++++++++++++++++++------ xen/common/domctl.c | 7 ++----- 2 files changed, 23 insertions(+), 11 deletions(-)
When using XSM Flask, passing DOMID_INVALID will result in a NULL pointer
reference from the passing of NULL as the target domain to
xsm_get_domain_state(). Simply not invoking xsm_get_domain_state() when the
target domain is NULL opens the opportunity to circumvent the XSM
get_domain_state access check. This is due to the fact that the call to
xsm_domctl() for get_domain_state op is a no-op check, deferring to
xsm_get_domain_state().
Modify the helper get_domain_state() to ensure the requesting domain has
get_domain_state access for the target domain, whether the target domain is
explicitly set or implicitly determined with a domain state search. In the case
of access not being allowed for a domain found during an implicit search, the
search will continue to the next domain whose state has changed.
Fixes: 3ad3df1bd0aa ("xen: add new domctl get_domain_state")
Reported-by: Chris Rogers <rogersc@ainfosec.com>
Reported-by: Dmytro Firsov <dmytro_firsov@epam.com>
Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
---
Changes in v2:
- fix commit message
- init dom as -1
- rework loop logic to use test_and_clear_bit()
---
xen/common/domain.c | 27 +++++++++++++++++++++------
xen/common/domctl.c | 7 ++-----
2 files changed, 23 insertions(+), 11 deletions(-)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index de6fdf59236e..2ffec331a8d1 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -210,7 +210,7 @@ static void set_domain_state_info(struct xen_domctl_get_domain_state *info,
int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
domid_t *domid)
{
- unsigned int dom;
+ unsigned int dom = -1;
int rc = -ENOENT;
struct domain *hdl;
@@ -219,6 +219,10 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
if ( d )
{
+ rc = xsm_get_domain_state(XSM_XS_PRIV, d);
+ if ( rc )
+ return rc;
+
set_domain_state_info(info, d);
return 0;
@@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
while ( dom_state_changed )
{
- dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
+ dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
if ( dom >= DOMID_FIRST_RESERVED )
break;
+
+ d = rcu_lock_domain_by_id(dom);
+ if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
+ {
+ rcu_unlock_domain(d);
+ d = NULL;
+ continue;
+ }
+
if ( test_and_clear_bit(dom, dom_state_changed) )
{
*domid = dom;
- d = rcu_lock_domain_by_id(dom);
-
if ( d )
{
set_domain_state_info(info, d);
-
rcu_unlock_domain(d);
}
else
memset(info, 0, sizeof(*info));
rc = 0;
-
break;
}
+
+ if ( d )
+ {
+ rcu_unlock_domain(d);
+ d = NULL;
+ }
}
out:
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 29a7726d32d0..2eedc639c72a 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -860,12 +860,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
break;
case XEN_DOMCTL_get_domain_state:
- ret = xsm_get_domain_state(XSM_XS_PRIV, d);
- if ( ret )
- break;
-
- copyback = 1;
ret = get_domain_state(&op->u.get_domain_state, d, &op->domain);
+ if ( !ret )
+ copyback = 1;
break;
default:
--
2.39.5
On 18.02.2026 20:08, Daniel P. Smith wrote:
> @@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>
> while ( dom_state_changed )
> {
> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
> if ( dom >= DOMID_FIRST_RESERVED )
> break;
> +
> + d = rcu_lock_domain_by_id(dom);
> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
> + {
> + rcu_unlock_domain(d);
> + d = NULL;
This looks unnecessary; the next loop iteration will set d unconditionally,
and d isn't (and wasn't) used past the loop. Plus there is also no such
clearing after the other rcu_unlock_domain().
> + continue;
> + }
This cleanup here also is redundant with the one done further down. Imo where
possible we should prefer to have only a single such instance, which looks to
be easily possible by using ...
> if ( test_and_clear_bit(dom, dom_state_changed) )
if ( (!d || !xsm_get_domain_state(XSM_XS_PRIV, d)) &&
test_and_clear_bit(dom, dom_state_changed) )
or
if ( (d && xsm_get_domain_state(XSM_XS_PRIV, d)) ||
!test_and_clear_bit(dom, dom_state_changed) )
{
...
continue;
}
albeit then the reduction of indentation of the subsequent code would cause
quite a bit more code churn.
> {
> *domid = dom;
>
> - d = rcu_lock_domain_by_id(dom);
> -
> if ( d )
> {
> set_domain_state_info(info, d);
> -
> rcu_unlock_domain(d);
> }
> else
> memset(info, 0, sizeof(*info));
>
> rc = 0;
> -
> break;
I don't think the blank lines need dropping for the purpose of the patch?
Yes, they may seem excessive, but nevertheless some prefer to have rather
too many of them than too few. (Personally I don't mind their removal,
but that really doesn't look to belong here.)
> }
> +
> + if ( d )
> + {
> + rcu_unlock_domain(d);
> + d = NULL;
> + }
> }
>
> out:
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -860,12 +860,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> break;
>
> case XEN_DOMCTL_get_domain_state:
> - ret = xsm_get_domain_state(XSM_XS_PRIV, d);
> - if ( ret )
> - break;
> -
> - copyback = 1;
> ret = get_domain_state(&op->u.get_domain_state, d, &op->domain);
> + if ( !ret )
> + copyback = 1;
Nit: As you need to touch this, please switch to using "true".
Jan
On 2/19/26 06:11, Jan Beulich wrote:
> On 18.02.2026 20:08, Daniel P. Smith wrote:
>> @@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>
>> while ( dom_state_changed )
>> {
>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
>> if ( dom >= DOMID_FIRST_RESERVED )
>> break;
>> +
>> + d = rcu_lock_domain_by_id(dom);
>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>> + {
>> + rcu_unlock_domain(d);
>> + d = NULL;
>
> This looks unnecessary; the next loop iteration will set d unconditionally,
> and d isn't (and wasn't) used past the loop. Plus there is also no such
> clearing after the other rcu_unlock_domain().
>
If the src domain didn't have the permission on any of the target
domains, then *d will leak the last domain checked back to the caller.
While I didn't see it being used after the call site, it's a good
principle not to leak from a function an object for which access was denied.
This is not an issue for the later location where rcu_unlock_domain() is
called, because at that point the src domain does have domain state
privilege for the target domain.
>> + continue;
>> + }
>
> This cleanup here also is redundant with the one done further down. Imo where
> possible we should prefer to have only a single such instance, which looks to
> be easily possible by using ...
>
>> if ( test_and_clear_bit(dom, dom_state_changed) )
>
>
> if ( (!d || !xsm_get_domain_state(XSM_XS_PRIV, d)) &&
> test_and_clear_bit(dom, dom_state_changed) )
>
> or
>
> if ( (d && xsm_get_domain_state(XSM_XS_PRIV, d)) ||
> !test_and_clear_bit(dom, dom_state_changed) )
> {
> ...
> continue;
> }
>
> albeit then the reduction of indentation of the subsequent code would cause
> quite a bit more code churn.
>
I would prefer the latter to keep the clearing of d to NULL only when
access was denied.
>> {
>> *domid = dom;
>>
>> - d = rcu_lock_domain_by_id(dom);
>> -
>> if ( d )
>> {
>> set_domain_state_info(info, d);
>> -
>> rcu_unlock_domain(d);
>> }
>> else
>> memset(info, 0, sizeof(*info));
>>
>> rc = 0;
>> -
>> break;
>
> I don't think the blank lines need dropping for the purpose of the patch?
> Yes, they may seem excessive, but nevertheless some prefer to have rather
> too many of them than too few. (Personally I don't mind their removal,
> but that really doesn't look to belong here.)
>
Okay, but as you state above, with the compounding of the if statement,
the indentation will go away and the whole section will a remove/add
section to the diff. Would you still want the line spacing maintained?
>> }
>> +
>> + if ( d )
>> + {
>> + rcu_unlock_domain(d);
>> + d = NULL;
>> + }
>> }
>>
>> out:
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -860,12 +860,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>> break;
>>
>> case XEN_DOMCTL_get_domain_state:
>> - ret = xsm_get_domain_state(XSM_XS_PRIV, d);
>> - if ( ret )
>> - break;
>> -
>> - copyback = 1;
>> ret = get_domain_state(&op->u.get_domain_state, d, &op->domain);
>> + if ( !ret )
>> + copyback = 1;
>
> Nit: As you need to touch this, please switch to using "true".
Just to be clear, you mean switching to a conditional statement vs
testing for not zero?
v/r,
dps
On 19.02.2026 13:34, Daniel P. Smith wrote:
> On 2/19/26 06:11, Jan Beulich wrote:
>> On 18.02.2026 20:08, Daniel P. Smith wrote:
>>> @@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>
>>> while ( dom_state_changed )
>>> {
>>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
>>> if ( dom >= DOMID_FIRST_RESERVED )
>>> break;
>>> +
>>> + d = rcu_lock_domain_by_id(dom);
>>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>>> + {
>>> + rcu_unlock_domain(d);
>>> + d = NULL;
>>
>> This looks unnecessary; the next loop iteration will set d unconditionally,
>> and d isn't (and wasn't) used past the loop. Plus there is also no such
>> clearing after the other rcu_unlock_domain().
>>
>
> If the src domain didn't have the permission on any of the target
> domains, then *d will leak the last domain checked back to the caller.
> While I didn't see it being used after the call site, it's a good
> principle not to leak from a function an object for which access was denied.
I don't follow: What do you mean by "*d will leak"? d is local to this function;
no domain pointer is being returned.
>>> + continue;
>>> + }
>>
>> This cleanup here also is redundant with the one done further down. Imo where
>> possible we should prefer to have only a single such instance, which looks to
>> be easily possible by using ...
>>
>>> if ( test_and_clear_bit(dom, dom_state_changed) )
>>
>>
>> if ( (!d || !xsm_get_domain_state(XSM_XS_PRIV, d)) &&
>> test_and_clear_bit(dom, dom_state_changed) )
>>
>> or
>>
>> if ( (d && xsm_get_domain_state(XSM_XS_PRIV, d)) ||
>> !test_and_clear_bit(dom, dom_state_changed) )
>> {
>> ...
>> continue;
>> }
>>
>> albeit then the reduction of indentation of the subsequent code would cause
>> quite a bit more code churn.
>
> I would prefer the latter to keep the clearing of d to NULL only when
> access was denied.
I again don't understand, as ...
>>> {
>>> *domid = dom;
>>>
>>> - d = rcu_lock_domain_by_id(dom);
>>> -
>>> if ( d )
>>> {
>>> set_domain_state_info(info, d);
>>> -
>>> rcu_unlock_domain(d);
>>> }
>>> else
>>> memset(info, 0, sizeof(*info));
>>>
>>> rc = 0;
>>> -
>>> break;
... this code would remain unchanged apart from the reduced indentation.
>> I don't think the blank lines need dropping for the purpose of the patch?
>> Yes, they may seem excessive, but nevertheless some prefer to have rather
>> too many of them than too few. (Personally I don't mind their removal,
>> but that really doesn't look to belong here.)
>
> Okay, but as you state above, with the compounding of the if statement,
> the indentation will go away and the whole section will a remove/add
> section to the diff. Would you still want the line spacing maintained?
As said, I don't mind those particular blank lines being dropped (if that's
because adjacent lines are touched anyway).
>>> --- a/xen/common/domctl.c
>>> +++ b/xen/common/domctl.c
>>> @@ -860,12 +860,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>> break;
>>>
>>> case XEN_DOMCTL_get_domain_state:
>>> - ret = xsm_get_domain_state(XSM_XS_PRIV, d);
>>> - if ( ret )
>>> - break;
>>> -
>>> - copyback = 1;
>>> ret = get_domain_state(&op->u.get_domain_state, d, &op->domain);
>>> + if ( !ret )
>>> + copyback = 1;
>>
>> Nit: As you need to touch this, please switch to using "true".
>
> Just to be clear, you mean switching to a conditional statement vs
> testing for not zero?
No. I mean using "true" in place of "1", since copyback is of type "bool".
Jan
On 2/19/26 08:23, Jan Beulich wrote:
> On 19.02.2026 13:34, Daniel P. Smith wrote:
>> On 2/19/26 06:11, Jan Beulich wrote:
>>> On 18.02.2026 20:08, Daniel P. Smith wrote:
>>>> @@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>>
>>>> while ( dom_state_changed )
>>>> {
>>>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>>>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
>>>> if ( dom >= DOMID_FIRST_RESERVED )
>>>> break;
>>>> +
>>>> + d = rcu_lock_domain_by_id(dom);
>>>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>>>> + {
>>>> + rcu_unlock_domain(d);
>>>> + d = NULL;
>>>
>>> This looks unnecessary; the next loop iteration will set d unconditionally,
>>> and d isn't (and wasn't) used past the loop. Plus there is also no such
>>> clearing after the other rcu_unlock_domain().
>>>
>>
>> If the src domain didn't have the permission on any of the target
>> domains, then *d will leak the last domain checked back to the caller.
>> While I didn't see it being used after the call site, it's a good
>> principle not to leak from a function an object for which access was denied.
>
> I don't follow: What do you mean by "*d will leak"? d is local to this function;
> no domain pointer is being returned.
>
My bad, you are correct, will drop.
>>>> + continue;
>>>> + }
>>>
>>> This cleanup here also is redundant with the one done further down. Imo where
>>> possible we should prefer to have only a single such instance, which looks to
>>> be easily possible by using ...
>>>
>>>> if ( test_and_clear_bit(dom, dom_state_changed) )
>>>
>>>
>>> if ( (!d || !xsm_get_domain_state(XSM_XS_PRIV, d)) &&
>>> test_and_clear_bit(dom, dom_state_changed) )
>>>
>>> or
>>>
>>> if ( (d && xsm_get_domain_state(XSM_XS_PRIV, d)) ||
>>> !test_and_clear_bit(dom, dom_state_changed) )
>>> {
>>> ...
>>> continue;
>>> }
>>>
>>> albeit then the reduction of indentation of the subsequent code would cause
>>> quite a bit more code churn.
>>
>> I would prefer the latter to keep the clearing of d to NULL only when
>> access was denied.
>
> I again don't understand, as ...
>
>>>> {
>>>> *domid = dom;
>>>>
>>>> - d = rcu_lock_domain_by_id(dom);
>>>> -
>>>> if ( d )
>>>> {
>>>> set_domain_state_info(info, d);
>>>> -
>>>> rcu_unlock_domain(d);
>>>> }
>>>> else
>>>> memset(info, 0, sizeof(*info));
>>>>
>>>> rc = 0;
>>>> -
>>>> break;
>
> ... this code would remain unchanged apart from the reduced indentation.
>
>>> I don't think the blank lines need dropping for the purpose of the patch?
>>> Yes, they may seem excessive, but nevertheless some prefer to have rather
>>> too many of them than too few. (Personally I don't mind their removal,
>>> but that really doesn't look to belong here.)
>>
>> Okay, but as you state above, with the compounding of the if statement,
>> the indentation will go away and the whole section will a remove/add
>> section to the diff. Would you still want the line spacing maintained?
>
> As said, I don't mind those particular blank lines being dropped (if that's
> because adjacent lines are touched anyway).
>
ack.
>>>> --- a/xen/common/domctl.c
>>>> +++ b/xen/common/domctl.c
>>>> @@ -860,12 +860,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>>> break;
>>>>
>>>> case XEN_DOMCTL_get_domain_state:
>>>> - ret = xsm_get_domain_state(XSM_XS_PRIV, d);
>>>> - if ( ret )
>>>> - break;
>>>> -
>>>> - copyback = 1;
>>>> ret = get_domain_state(&op->u.get_domain_state, d, &op->domain);
>>>> + if ( !ret )
>>>> + copyback = 1;
>>>
>>> Nit: As you need to touch this, please switch to using "true".
>>
>> Just to be clear, you mean switching to a conditional statement vs
>> testing for not zero?
>
> No. I mean using "true" in place of "1", since copyback is of type "bool".
>
ack.
On 2026-02-18 14:08, Daniel P. Smith wrote:
> When using XSM Flask, passing DOMID_INVALID will result in a NULL pointer
> reference from the passing of NULL as the target domain to
> xsm_get_domain_state(). Simply not invoking xsm_get_domain_state() when the
> target domain is NULL opens the opportunity to circumvent the XSM
> get_domain_state access check. This is due to the fact that the call to
> xsm_domctl() for get_domain_state op is a no-op check, deferring to
> xsm_get_domain_state().
>
> Modify the helper get_domain_state() to ensure the requesting domain has
> get_domain_state access for the target domain, whether the target domain is
> explicitly set or implicitly determined with a domain state search. In the case
> of access not being allowed for a domain found during an implicit search, the
> search will continue to the next domain whose state has changed.
>
> Fixes: 3ad3df1bd0aa ("xen: add new domctl get_domain_state")
> Reported-by: Chris Rogers <rogersc@ainfosec.com>
> Reported-by: Dmytro Firsov <dmytro_firsov@epam.com>
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> ---
> Changes in v2:
> - fix commit message
> - init dom as -1
> - rework loop logic to use test_and_clear_bit()
> ---
> xen/common/domain.c | 27 +++++++++++++++++++++------
> xen/common/domctl.c | 7 ++-----
> 2 files changed, 23 insertions(+), 11 deletions(-)
>
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index de6fdf59236e..2ffec331a8d1 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -210,7 +210,7 @@ static void set_domain_state_info(struct xen_domctl_get_domain_state *info,
> int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
> domid_t *domid)
> {
> - unsigned int dom;
> + unsigned int dom = -1;
> int rc = -ENOENT;
> struct domain *hdl;
>
> @@ -219,6 +219,10 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>
> if ( d )
> {
> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
> + if ( rc )
> + return rc;
> +
> set_domain_state_info(info, d);
>
> return 0;
> @@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
Between the two hunks is this:
hdl = lock_dom_exc_handler();
/*
* Only domain registered for VIRQ_DOM_EXC event is allowed to query
* domains having changed state.
*/
if ( current->domain != hdl )
{
rc = -EACCES;
goto out;
}
So it is only the domain with VIRQ_DOM_EXC that can enter the while loop:
>
> while ( dom_state_changed )
> {
> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
> if ( dom >= DOMID_FIRST_RESERVED )
> break;
> +
> + d = rcu_lock_domain_by_id(dom);
> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
... if the VIRQ_DOM_EXC owner is denied for domain d ...
> + {
> + rcu_unlock_domain(d);
> + d = NULL;
> + continue;
... the caller would continue ...
> + }
> +
> if ( test_and_clear_bit(dom, dom_state_changed) )
... and this bit would never be cleared. Should the VIRQ_DOM_EXC owner
always get to clear the bit even if it cannot see the result? Is this
too much of a corner case to care about?
The patch generally seems okay aside from this.
> {
> *domid = dom;
>
> - d = rcu_lock_domain_by_id(dom);
> -
> if ( d )
> {
> set_domain_state_info(info, d);
> -
> rcu_unlock_domain(d);
> }
> else
> memset(info, 0, sizeof(*info));
>
> rc = 0;
> -
> break;
> }
> +
> + if ( d )
> + {
> + rcu_unlock_domain(d);
> + d = NULL;
> + }
> }
>
> out:
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index 29a7726d32d0..2eedc639c72a 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -860,12 +860,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> break;
>
> case XEN_DOMCTL_get_domain_state:
> - ret = xsm_get_domain_state(XSM_XS_PRIV, d);
With the initial NULL deref issue, I wondered if this wouldn't be better
off modified to drop the d param - xsm_get_domain_state(XSM_XS_PRIV) -
and changing flask permissions like:
allow dom0_t xen_t:xen get_domain_state;
That would gate the external call, and individual domain permissions
could be checked with xsm_getdomaininfo(), or a new hook if you don't
want to re-use.
But as your approach avoids passing NULL, it seems okay to me. It also
doesn't change the flask policy, which is nice.
Thanks,
Jason
> - if ( ret )
> - break;
> -
> - copyback = 1;
> ret = get_domain_state(&op->u.get_domain_state, d, &op->domain);
> + if ( !ret )
> + copyback = 1;
> break;
>
> default:
On 2/18/26 18:04, Jason Andryuk wrote:
> On 2026-02-18 14:08, Daniel P. Smith wrote:
>> When using XSM Flask, passing DOMID_INVALID will result in a NULL pointer
>> reference from the passing of NULL as the target domain to
>> xsm_get_domain_state(). Simply not invoking xsm_get_domain_state()
>> when the
>> target domain is NULL opens the opportunity to circumvent the XSM
>> get_domain_state access check. This is due to the fact that the call to
>> xsm_domctl() for get_domain_state op is a no-op check, deferring to
>> xsm_get_domain_state().
>>
>> Modify the helper get_domain_state() to ensure the requesting domain has
>> get_domain_state access for the target domain, whether the target
>> domain is
>> explicitly set or implicitly determined with a domain state search. In
>> the case
>> of access not being allowed for a domain found during an implicit
>> search, the
>> search will continue to the next domain whose state has changed.
>>
>> Fixes: 3ad3df1bd0aa ("xen: add new domctl get_domain_state")
>> Reported-by: Chris Rogers <rogersc@ainfosec.com>
>> Reported-by: Dmytro Firsov <dmytro_firsov@epam.com>
>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>> ---
>> Changes in v2:
>> - fix commit message
>> - init dom as -1
>> - rework loop logic to use test_and_clear_bit()
>> ---
>> xen/common/domain.c | 27 +++++++++++++++++++++------
>> xen/common/domctl.c | 7 ++-----
>> 2 files changed, 23 insertions(+), 11 deletions(-)
>>
>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>> index de6fdf59236e..2ffec331a8d1 100644
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -210,7 +210,7 @@ static void set_domain_state_info(struct
>> xen_domctl_get_domain_state *info,
>> int get_domain_state(struct xen_domctl_get_domain_state *info,
>> struct domain *d,
>> domid_t *domid)
>> {
>> - unsigned int dom;
>> + unsigned int dom = -1;
>> int rc = -ENOENT;
>> struct domain *hdl;
>> @@ -219,6 +219,10 @@ int get_domain_state(struct
>> xen_domctl_get_domain_state *info, struct domain *d,
>> if ( d )
>> {
>> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
>> + if ( rc )
>> + return rc;
>> +
>> set_domain_state_info(info, d);
>> return 0;
>> @@ -238,28 +242,39 @@ int get_domain_state(struct
>> xen_domctl_get_domain_state *info, struct domain *d,
>
> Between the two hunks is this:
>
> hdl = lock_dom_exc_handler();
>
> /*
> * Only domain registered for VIRQ_DOM_EXC event is allowed to query
> * domains having changed state.
> */
> if ( current->domain != hdl )
> {
> rc = -EACCES;
> goto out;
> }
>
> So it is only the domain with VIRQ_DOM_EXC that can enter the while loop:
>
>> while ( dom_state_changed )
>> {
>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
>> if ( dom >= DOMID_FIRST_RESERVED )
>> break;
>> +
>> + d = rcu_lock_domain_by_id(dom);
>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>
> ... if the VIRQ_DOM_EXC owner is denied for domain d ...
>
>> + {
>> + rcu_unlock_domain(d);
>> + d = NULL;
>> + continue;
>
> ... the caller would continue ...
>
>> + }
>> +
>> if ( test_and_clear_bit(dom, dom_state_changed) )
>
> ... and this bit would never be cleared. Should the VIRQ_DOM_EXC owner
> always get to clear the bit even if it cannot see the result? Is this
> too much of a corner case to care about?
If we want to go down this discussion path, here is yet another example
access control decisions are getting codified directly against
properties of a domain, in particular against VIRQ's which were waived
from XSM control. If having access to certain VIRQ's is going to be a
privilege determination, then perhaps we should visit whether they
should be labeled objects like physical IRQs.
> The patch generally seems okay aside from this.
>
>> {
>> *domid = dom;
>> - d = rcu_lock_domain_by_id(dom);
>> -
>> if ( d )
>> {
>> set_domain_state_info(info, d);
>> -
>> rcu_unlock_domain(d);
>> }
>> else
>> memset(info, 0, sizeof(*info));
>> rc = 0;
>> -
>> break;
>> }
>> +
>> + if ( d )
>> + {
>> + rcu_unlock_domain(d);
>> + d = NULL;
>> + }
>> }
>> out:
>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
>> index 29a7726d32d0..2eedc639c72a 100644
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -860,12 +860,9 @@ long
>> do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>> break;
>> case XEN_DOMCTL_get_domain_state:
>> - ret = xsm_get_domain_state(XSM_XS_PRIV, d);
>
> With the initial NULL deref issue, I wondered if this wouldn't be better
> off modified to drop the d param - xsm_get_domain_state(XSM_XS_PRIV) -
> and changing flask permissions like:
> allow dom0_t xen_t:xen get_domain_state;
>
> That would gate the external call, and individual domain permissions
> could be checked with xsm_getdomaininfo(), or a new hook if you don't
> want to re-use.
>
> But as your approach avoids passing NULL, it seems okay to me. It also
> doesn't change the flask policy, which is nice.
That's a plain nack from me. Whether it is viewed as a pipe dream or
not, my goal continues to be to work towards enabling the ability to
have a truly disaggregated platform. In the original architecture, it
was envisioned to have multiple toolstack domains, each responsible for
a distinct set of domains. In terms of implementation, that would mean
multiple xenstored instances, each with a purview over a subset of domains.
v/r,
dps
On 2026-02-19 07:15, Daniel P. Smith wrote: > On 2/18/26 18:04, Jason Andryuk wrote: >>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c >>> index 29a7726d32d0..2eedc639c72a 100644 >>> --- a/xen/common/domctl.c >>> +++ b/xen/common/domctl.c >>> @@ -860,12 +860,9 @@ long >>> do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) >>> break; >>> case XEN_DOMCTL_get_domain_state: >>> - ret = xsm_get_domain_state(XSM_XS_PRIV, d); >> >> With the initial NULL deref issue, I wondered if this wouldn't be >> better off modified to drop the d param - >> xsm_get_domain_state(XSM_XS_PRIV) - and changing flask permissions like: >> allow dom0_t xen_t:xen get_domain_state; >> >> That would gate the external call, and individual domain permissions >> could be checked with xsm_getdomaininfo(), or a new hook if you don't >> want to re-use. >> >> But as your approach avoids passing NULL, it seems okay to me. It >> also doesn't change the flask policy, which is nice. > > That's a plain nack from me. Whether it is viewed as a pipe dream or > not, my goal continues to be to work towards enabling the ability to > have a truly disaggregated platform. In the original architecture, it > was envisioned to have multiple toolstack domains, each responsible for > a distinct set of domains. In terms of implementation, that would mean > multiple xenstored instances, each with a purview over a subset of domains. I don't think what I wrote is at odds with your pipe dream. My main concern is passing NULL as a domain. I think that is wrong, beyond the fault seen on ARM. In domain_target_sid(), I think the NULL dst was mistakenly matched to dom0's NULL src->target. src->target_sid is 0 from zalloc, which is not otherwise initialized and invalid. That is returned. Later when sidtab_search() can't find it, it is remapped to SECINITSID_UNLABELED and returned. That silent remapping is dubious, and it points toward unlabled_t should never be allowed in any rule. Maybe it should remap unknown sids to a dedicated invalid_t, but maybe invalid_t is already supposed to be that? Regards, Jason
On 2/23/26 13:01, Jason Andryuk wrote: > On 2026-02-19 07:15, Daniel P. Smith wrote: >> On 2/18/26 18:04, Jason Andryuk wrote: > >>>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c >>>> index 29a7726d32d0..2eedc639c72a 100644 >>>> --- a/xen/common/domctl.c >>>> +++ b/xen/common/domctl.c >>>> @@ -860,12 +860,9 @@ long >>>> do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) >>>> break; >>>> case XEN_DOMCTL_get_domain_state: >>>> - ret = xsm_get_domain_state(XSM_XS_PRIV, d); >>> >>> With the initial NULL deref issue, I wondered if this wouldn't be >>> better off modified to drop the d param - >>> xsm_get_domain_state(XSM_XS_PRIV) - and changing flask permissions like: >>> allow dom0_t xen_t:xen get_domain_state; >>> >>> That would gate the external call, and individual domain permissions >>> could be checked with xsm_getdomaininfo(), or a new hook if you don't >>> want to re-use. >>> >>> But as your approach avoids passing NULL, it seems okay to me. It >>> also doesn't change the flask policy, which is nice. >> >> That's a plain nack from me. Whether it is viewed as a pipe dream or >> not, my goal continues to be to work towards enabling the ability to >> have a truly disaggregated platform. In the original architecture, it >> was envisioned to have multiple toolstack domains, each responsible >> for a distinct set of domains. In terms of implementation, that would >> mean multiple xenstored instances, each with a purview over a subset >> of domains. > > I don't think what I wrote is at odds with your pipe dream. > > My main concern is passing NULL as a domain. I think that is wrong, > beyond the fault seen on ARM. In domain_target_sid(), I think the NULL > dst was mistakenly matched to dom0's NULL src->target. src->target_sid > is 0 from zalloc, which is not otherwise initialized and invalid. That > is returned. Later when sidtab_search() can't find it, it is remapped > to SECINITSID_UNLABELED and returned. That silent remapping is dubious, > and it points toward unlabled_t should never be allowed in any rule. > Maybe it should remap unknown sids to a dedicated invalid_t, but maybe > invalid_t is already supposed to be that? Understand, which in my initial approach I considered instead of setting the d to NULL for DOMID_INVALID, I was considered changing it to dom_xen. This effectively says you must have permission to wildcard query xen for domain states. I opted not to include it as I was able to rework it so that NULL was never used for xsm_get_domain_state(). Your last statement is correct, unlabeled_t literally means it is an unknown object. Therefore, writing any rules for unlabeled_t is dubious at a minimum, but more likely an invalid security statements. The suggestion to write the rule was not meant to be the solution, it was just a means to help allow people to boot their systems for the time being. v/r, dps
On 19.02.2026 00:04, Jason Andryuk wrote:
> On 2026-02-18 14:08, Daniel P. Smith wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -210,7 +210,7 @@ static void set_domain_state_info(struct xen_domctl_get_domain_state *info,
>> int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>> domid_t *domid)
>> {
>> - unsigned int dom;
>> + unsigned int dom = -1;
>> int rc = -ENOENT;
>> struct domain *hdl;
>>
>> @@ -219,6 +219,10 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>
>> if ( d )
>> {
>> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
>> + if ( rc )
>> + return rc;
>> +
>> set_domain_state_info(info, d);
>>
>> return 0;
>> @@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>
> Between the two hunks is this:
>
> hdl = lock_dom_exc_handler();
>
> /*
> * Only domain registered for VIRQ_DOM_EXC event is allowed to query
> * domains having changed state.
> */
> if ( current->domain != hdl )
> {
> rc = -EACCES;
> goto out;
> }
>
> So it is only the domain with VIRQ_DOM_EXC that can enter the while loop:
>
>>
>> while ( dom_state_changed )
>> {
>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
>> if ( dom >= DOMID_FIRST_RESERVED )
>> break;
>> +
>> + d = rcu_lock_domain_by_id(dom);
>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>
> ... if the VIRQ_DOM_EXC owner is denied for domain d ...
>
>> + {
>> + rcu_unlock_domain(d);
>> + d = NULL;
>> + continue;
>
> ... the caller would continue ...
>
>> + }
>> +
>> if ( test_and_clear_bit(dom, dom_state_changed) )
>
> ... and this bit would never be cleared. Should the VIRQ_DOM_EXC owner
> always get to clear the bit even if it cannot see the result?
I don't think so, no. Whenever a legitimate consumer occurs (the owner of
VIRQ_DOM_EXC can change, after all), it'll then consume the bits as needed.
More generally, I think we're better off not making the code here depend
too much on that special VIRQ_DOM_EXC property.
Jan
On 19.02.26 11:52, Jan Beulich wrote:
> On 19.02.2026 00:04, Jason Andryuk wrote:
>> On 2026-02-18 14:08, Daniel P. Smith wrote:
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -210,7 +210,7 @@ static void set_domain_state_info(struct xen_domctl_get_domain_state *info,
>>> int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>> domid_t *domid)
>>> {
>>> - unsigned int dom;
>>> + unsigned int dom = -1;
>>> int rc = -ENOENT;
>>> struct domain *hdl;
>>>
>>> @@ -219,6 +219,10 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>
>>> if ( d )
>>> {
>>> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
>>> + if ( rc )
>>> + return rc;
>>> +
>>> set_domain_state_info(info, d);
>>>
>>> return 0;
>>> @@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>
>> Between the two hunks is this:
>>
>> hdl = lock_dom_exc_handler();
>>
>> /*
>> * Only domain registered for VIRQ_DOM_EXC event is allowed to query
>> * domains having changed state.
>> */
>> if ( current->domain != hdl )
>> {
>> rc = -EACCES;
>> goto out;
>> }
>>
>> So it is only the domain with VIRQ_DOM_EXC that can enter the while loop:
>>
>>>
>>> while ( dom_state_changed )
>>> {
>>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
>>> if ( dom >= DOMID_FIRST_RESERVED )
>>> break;
>>> +
>>> + d = rcu_lock_domain_by_id(dom);
>>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>>
>> ... if the VIRQ_DOM_EXC owner is denied for domain d ...
>>
>>> + {
>>> + rcu_unlock_domain(d);
>>> + d = NULL;
>>> + continue;
>>
>> ... the caller would continue ...
>>
>>> + }
>>> +
>>> if ( test_and_clear_bit(dom, dom_state_changed) )
>>
>> ... and this bit would never be cleared. Should the VIRQ_DOM_EXC owner
>> always get to clear the bit even if it cannot see the result?
>
> I don't think so, no. Whenever a legitimate consumer occurs (the owner of
> VIRQ_DOM_EXC can change, after all), it'll then consume the bits as needed.
> More generally, I think we're better off not making the code here depend
> too much on that special VIRQ_DOM_EXC property.
OTOH a new VIRQ_DOM_EXC owner will result in a complete reset of the bitmap
anyway (that is: the bits for all existing domains will be set, while all
others will be cleared).
Juergen
On 19.02.2026 12:03, Juergen Gross wrote:
> On 19.02.26 11:52, Jan Beulich wrote:
>> On 19.02.2026 00:04, Jason Andryuk wrote:
>>> On 2026-02-18 14:08, Daniel P. Smith wrote:
>>>> --- a/xen/common/domain.c
>>>> +++ b/xen/common/domain.c
>>>> @@ -210,7 +210,7 @@ static void set_domain_state_info(struct xen_domctl_get_domain_state *info,
>>>> int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>> domid_t *domid)
>>>> {
>>>> - unsigned int dom;
>>>> + unsigned int dom = -1;
>>>> int rc = -ENOENT;
>>>> struct domain *hdl;
>>>>
>>>> @@ -219,6 +219,10 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>>
>>>> if ( d )
>>>> {
>>>> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
>>>> + if ( rc )
>>>> + return rc;
>>>> +
>>>> set_domain_state_info(info, d);
>>>>
>>>> return 0;
>>>> @@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>
>>> Between the two hunks is this:
>>>
>>> hdl = lock_dom_exc_handler();
>>>
>>> /*
>>> * Only domain registered for VIRQ_DOM_EXC event is allowed to query
>>> * domains having changed state.
>>> */
>>> if ( current->domain != hdl )
>>> {
>>> rc = -EACCES;
>>> goto out;
>>> }
>>>
>>> So it is only the domain with VIRQ_DOM_EXC that can enter the while loop:
>>>
>>>>
>>>> while ( dom_state_changed )
>>>> {
>>>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>>>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
>>>> if ( dom >= DOMID_FIRST_RESERVED )
>>>> break;
>>>> +
>>>> + d = rcu_lock_domain_by_id(dom);
>>>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>>>
>>> ... if the VIRQ_DOM_EXC owner is denied for domain d ...
>>>
>>>> + {
>>>> + rcu_unlock_domain(d);
>>>> + d = NULL;
>>>> + continue;
>>>
>>> ... the caller would continue ...
>>>
>>>> + }
>>>> +
>>>> if ( test_and_clear_bit(dom, dom_state_changed) )
>>>
>>> ... and this bit would never be cleared. Should the VIRQ_DOM_EXC owner
>>> always get to clear the bit even if it cannot see the result?
>>
>> I don't think so, no. Whenever a legitimate consumer occurs (the owner of
>> VIRQ_DOM_EXC can change, after all), it'll then consume the bits as needed.
>> More generally, I think we're better off not making the code here depend
>> too much on that special VIRQ_DOM_EXC property.
>
> OTOH a new VIRQ_DOM_EXC owner will result in a complete reset of the bitmap
> anyway (that is: the bits for all existing domains will be set, while all
> others will be cleared).
Yes, while writing my reply I wondered whether I should mention that. To keep
things a little more simple, I didn't. Plus for this aspect the last sentence
of my earlier reply also applies.
Jan
On 19.02.26 12:13, Jan Beulich wrote:
> On 19.02.2026 12:03, Juergen Gross wrote:
>> On 19.02.26 11:52, Jan Beulich wrote:
>>> On 19.02.2026 00:04, Jason Andryuk wrote:
>>>> On 2026-02-18 14:08, Daniel P. Smith wrote:
>>>>> --- a/xen/common/domain.c
>>>>> +++ b/xen/common/domain.c
>>>>> @@ -210,7 +210,7 @@ static void set_domain_state_info(struct xen_domctl_get_domain_state *info,
>>>>> int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>>> domid_t *domid)
>>>>> {
>>>>> - unsigned int dom;
>>>>> + unsigned int dom = -1;
>>>>> int rc = -ENOENT;
>>>>> struct domain *hdl;
>>>>>
>>>>> @@ -219,6 +219,10 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>>>
>>>>> if ( d )
>>>>> {
>>>>> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
>>>>> + if ( rc )
>>>>> + return rc;
>>>>> +
>>>>> set_domain_state_info(info, d);
>>>>>
>>>>> return 0;
>>>>> @@ -238,28 +242,39 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>>
>>>> Between the two hunks is this:
>>>>
>>>> hdl = lock_dom_exc_handler();
>>>>
>>>> /*
>>>> * Only domain registered for VIRQ_DOM_EXC event is allowed to query
>>>> * domains having changed state.
>>>> */
>>>> if ( current->domain != hdl )
>>>> {
>>>> rc = -EACCES;
>>>> goto out;
>>>> }
>>>>
>>>> So it is only the domain with VIRQ_DOM_EXC that can enter the while loop:
>>>>
>>>>>
>>>>> while ( dom_state_changed )
>>>>> {
>>>>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>>>>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
>>>>> if ( dom >= DOMID_FIRST_RESERVED )
>>>>> break;
>>>>> +
>>>>> + d = rcu_lock_domain_by_id(dom);
>>>>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>>>>
>>>> ... if the VIRQ_DOM_EXC owner is denied for domain d ...
>>>>
>>>>> + {
>>>>> + rcu_unlock_domain(d);
>>>>> + d = NULL;
>>>>> + continue;
>>>>
>>>> ... the caller would continue ...
>>>>
>>>>> + }
>>>>> +
>>>>> if ( test_and_clear_bit(dom, dom_state_changed) )
>>>>
>>>> ... and this bit would never be cleared. Should the VIRQ_DOM_EXC owner
>>>> always get to clear the bit even if it cannot see the result?
>>>
>>> I don't think so, no. Whenever a legitimate consumer occurs (the owner of
>>> VIRQ_DOM_EXC can change, after all), it'll then consume the bits as needed.
>>> More generally, I think we're better off not making the code here depend
>>> too much on that special VIRQ_DOM_EXC property.
>>
>> OTOH a new VIRQ_DOM_EXC owner will result in a complete reset of the bitmap
>> anyway (that is: the bits for all existing domains will be set, while all
>> others will be cleared).
>
> Yes, while writing my reply I wondered whether I should mention that. To keep
> things a little more simple, I didn't. Plus for this aspect the last sentence
> of my earlier reply also applies.
I believe flask settings can be changed, right?
If so, clearing the bit would be affecting the scenario where the VIRQ_DOM_EXC
is NOT changed, but the flask settings are updated to allow it obtaining info
about d.
Juergen
On 2026-02-19 06:21, Jürgen Groß wrote:
> On 19.02.26 12:13, Jan Beulich wrote:
>> On 19.02.2026 12:03, Juergen Gross wrote:
>>> On 19.02.26 11:52, Jan Beulich wrote:
>>>> On 19.02.2026 00:04, Jason Andryuk wrote:
>>>>> On 2026-02-18 14:08, Daniel P. Smith wrote:
>>>>>> --- a/xen/common/domain.c
>>>>>> +++ b/xen/common/domain.c
>>>>>> @@ -210,7 +210,7 @@ static void set_domain_state_info(struct
>>>>>> xen_domctl_get_domain_state *info,
>>>>>> int get_domain_state(struct xen_domctl_get_domain_state *info,
>>>>>> struct domain *d,
>>>>>> domid_t *domid)
>>>>>> {
>>>>>> - unsigned int dom;
>>>>>> + unsigned int dom = -1;
>>>>>> int rc = -ENOENT;
>>>>>> struct domain *hdl;
>>>>>> @@ -219,6 +219,10 @@ int get_domain_state(struct
>>>>>> xen_domctl_get_domain_state *info, struct domain *d,
>>>>>> if ( d )
>>>>>> {
>>>>>> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
>>>>>> + if ( rc )
>>>>>> + return rc;
>>>>>> +
>>>>>> set_domain_state_info(info, d);
>>>>>> return 0;
>>>>>> @@ -238,28 +242,39 @@ int get_domain_state(struct
>>>>>> xen_domctl_get_domain_state *info, struct domain *d,
>>>>>
>>>>> Between the two hunks is this:
>>>>>
>>>>> hdl = lock_dom_exc_handler();
>>>>>
>>>>> /*
>>>>> * Only domain registered for VIRQ_DOM_EXC event is allowed
>>>>> to query
>>>>> * domains having changed state.
>>>>> */
>>>>> if ( current->domain != hdl )
>>>>> {
>>>>> rc = -EACCES;
>>>>> goto out;
>>>>> }
>>>>>
>>>>> So it is only the domain with VIRQ_DOM_EXC that can enter the while
>>>>> loop:
>>>>>
>>>>>> while ( dom_state_changed )
>>>>>> {
>>>>>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>>>>>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1,
>>>>>> dom + 1);
>>>>>> if ( dom >= DOMID_FIRST_RESERVED )
>>>>>> break;
>>>>>> +
>>>>>> + d = rcu_lock_domain_by_id(dom);
>>>>>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>>>>>
>>>>> ... if the VIRQ_DOM_EXC owner is denied for domain d ...
>>>>>
>>>>>> + {
>>>>>> + rcu_unlock_domain(d);
>>>>>> + d = NULL;
>>>>>> + continue;
>>>>>
>>>>> ... the caller would continue ...
>>>>>
>>>>>> + }
>>>>>> +
>>>>>> if ( test_and_clear_bit(dom, dom_state_changed) )
>>>>>
>>>>> ... and this bit would never be cleared. Should the VIRQ_DOM_EXC
>>>>> owner
>>>>> always get to clear the bit even if it cannot see the result?
>>>>
>>>> I don't think so, no. Whenever a legitimate consumer occurs (the
>>>> owner of
>>>> VIRQ_DOM_EXC can change, after all), it'll then consume the bits as
>>>> needed.
>>>> More generally, I think we're better off not making the code here
>>>> depend
>>>> too much on that special VIRQ_DOM_EXC property.
>>>
>>> OTOH a new VIRQ_DOM_EXC owner will result in a complete reset of the
>>> bitmap
>>> anyway (that is: the bits for all existing domains will be set, while
>>> all
>>> others will be cleared).
>>
>> Yes, while writing my reply I wondered whether I should mention that.
>> To keep
>> things a little more simple, I didn't. Plus for this aspect the last
>> sentence
>> of my earlier reply also applies.
>
> I believe flask settings can be changed, right?
>
> If so, clearing the bit would be affecting the scenario where the
> VIRQ_DOM_EXC
> is NOT changed, but the flask settings are updated to allow it obtaining
> info
> about d.
Yes, this could happen, so the bits should remain set.
Thanks,
Jason
On 19.02.26 12:31, Jason Andryuk wrote:
> On 2026-02-19 06:21, Jürgen Groß wrote:
>> On 19.02.26 12:13, Jan Beulich wrote:
>>> On 19.02.2026 12:03, Juergen Gross wrote:
>>>> On 19.02.26 11:52, Jan Beulich wrote:
>>>>> On 19.02.2026 00:04, Jason Andryuk wrote:
>>>>>> On 2026-02-18 14:08, Daniel P. Smith wrote:
>>>>>>> --- a/xen/common/domain.c
>>>>>>> +++ b/xen/common/domain.c
>>>>>>> @@ -210,7 +210,7 @@ static void set_domain_state_info(struct
>>>>>>> xen_domctl_get_domain_state *info,
>>>>>>> int get_domain_state(struct xen_domctl_get_domain_state *info, struct
>>>>>>> domain *d,
>>>>>>> domid_t *domid)
>>>>>>> {
>>>>>>> - unsigned int dom;
>>>>>>> + unsigned int dom = -1;
>>>>>>> int rc = -ENOENT;
>>>>>>> struct domain *hdl;
>>>>>>> @@ -219,6 +219,10 @@ int get_domain_state(struct
>>>>>>> xen_domctl_get_domain_state *info, struct domain *d,
>>>>>>> if ( d )
>>>>>>> {
>>>>>>> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
>>>>>>> + if ( rc )
>>>>>>> + return rc;
>>>>>>> +
>>>>>>> set_domain_state_info(info, d);
>>>>>>> return 0;
>>>>>>> @@ -238,28 +242,39 @@ int get_domain_state(struct
>>>>>>> xen_domctl_get_domain_state *info, struct domain *d,
>>>>>>
>>>>>> Between the two hunks is this:
>>>>>>
>>>>>> hdl = lock_dom_exc_handler();
>>>>>>
>>>>>> /*
>>>>>> * Only domain registered for VIRQ_DOM_EXC event is allowed to query
>>>>>> * domains having changed state.
>>>>>> */
>>>>>> if ( current->domain != hdl )
>>>>>> {
>>>>>> rc = -EACCES;
>>>>>> goto out;
>>>>>> }
>>>>>>
>>>>>> So it is only the domain with VIRQ_DOM_EXC that can enter the while loop:
>>>>>>
>>>>>>> while ( dom_state_changed )
>>>>>>> {
>>>>>>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>>>>>>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom + 1);
>>>>>>> if ( dom >= DOMID_FIRST_RESERVED )
>>>>>>> break;
>>>>>>> +
>>>>>>> + d = rcu_lock_domain_by_id(dom);
>>>>>>> + if ( d && xsm_get_domain_state(XSM_XS_PRIV, d) )
>>>>>>
>>>>>> ... if the VIRQ_DOM_EXC owner is denied for domain d ...
>>>>>>
>>>>>>> + {
>>>>>>> + rcu_unlock_domain(d);
>>>>>>> + d = NULL;
>>>>>>> + continue;
>>>>>>
>>>>>> ... the caller would continue ...
>>>>>>
>>>>>>> + }
>>>>>>> +
>>>>>>> if ( test_and_clear_bit(dom, dom_state_changed) )
>>>>>>
>>>>>> ... and this bit would never be cleared. Should the VIRQ_DOM_EXC owner
>>>>>> always get to clear the bit even if it cannot see the result?
>>>>>
>>>>> I don't think so, no. Whenever a legitimate consumer occurs (the owner of
>>>>> VIRQ_DOM_EXC can change, after all), it'll then consume the bits as needed.
>>>>> More generally, I think we're better off not making the code here depend
>>>>> too much on that special VIRQ_DOM_EXC property.
>>>>
>>>> OTOH a new VIRQ_DOM_EXC owner will result in a complete reset of the bitmap
>>>> anyway (that is: the bits for all existing domains will be set, while all
>>>> others will be cleared).
>>>
>>> Yes, while writing my reply I wondered whether I should mention that. To keep
>>> things a little more simple, I didn't. Plus for this aspect the last sentence
>>> of my earlier reply also applies.
>>
>> I believe flask settings can be changed, right?
>>
>> If so, clearing the bit would be affecting the scenario where the VIRQ_DOM_EXC
>> is NOT changed, but the flask settings are updated to allow it obtaining info
>> about d.
>
> Yes, this could happen, so the bits should remain set.
In the above scenario, I think VIRQ_DOM_EXC should be triggered after a flask
policy update in order to let xenstored look for previously missed domain state
changes.
Juergen
© 2016 - 2026 Red Hat, Inc.