A 4K page appears to be able to hold 128 ioreq entries, which luckily
matches the current vCPU limit. However, if we decide to increase the
vCPU limit, that no longer holds and this function would silently
fetch an out-of-bounds pointer.
No architecture allows more than 128 vCPUs for HVM guests, and pages
are at most 4 KB, so this case doesn't occur with the current limits.
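(For reference, with the current public interface sizeof(struct ioreq)
is 32 bytes, so a 4 KB page holds 4096 / 32 = 128 entries, exactly the
current HVM vCPU limit.)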
Make sure that out-of-bounds attempts are reported, and adjust the
surrounding logic to at worst crash the offending domain instead.
No functional change.
Reported-by: Julian Vetter <julian.vetter@vates.tech>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
v2:
- check and report instead of ASSERT, and possibly crash the offending domain
xen/common/ioreq.c | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index f5fd30ce12..a2a2dafe85 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -100,7 +100,14 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
ASSERT((v == current) || !vcpu_runnable(v));
ASSERT(p != NULL);
- return &p->vcpu_ioreq[v->vcpu_id];
+ if ( likely(v->vcpu_id < (PAGE_SIZE / sizeof(struct ioreq))) )
+ return &p->vcpu_ioreq[v->vcpu_id];
+ else
+ {
+ gprintk(XENLOG_ERR, "Out of bounds vCPU %pv in ioreq server\n", v);
+ WARN();
+ return NULL;
+ }
}
/*
@@ -154,9 +161,17 @@ bool vcpu_ioreq_pending(struct vcpu *v)
static bool wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
{
unsigned int prev_state = STATE_IOREQ_NONE;
- unsigned int state = p->state;
+ unsigned int state;
uint64_t data = ~0;
+ if ( unlikely(!p) )
+ {
+ domain_crash(sv->vcpu->domain);
+ return false;
+ }
+
+ state = p->state;
+
smp_rmb();
/*
@@ -354,7 +369,10 @@ static void ioreq_server_update_evtchn(struct ioreq_server *s,
{
ioreq_t *p = get_ioreq(s, sv->vcpu);
- p->vp_eport = sv->ioreq_evtchn;
+ if ( likely(p) )
+ p->vp_eport = sv->ioreq_evtchn;
+ else
+ WARN();
}
}
@@ -1274,6 +1292,9 @@ int ioreq_send(struct ioreq_server *s, ioreq_t *proto_p,
evtchn_port_t port = sv->ioreq_evtchn;
ioreq_t *p = get_ioreq(s, curr);
+ if ( unlikely(!p) )
+ break;
+
if ( unlikely(p->state != STATE_IOREQ_NONE) )
{
gprintk(XENLOG_ERR, "device model set bad IO state %d\n",
--
2.51.2
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
On 14.11.2025 17:32, Teddy Astie wrote:
> A 4K page appears to be able to hold 128 ioreq entries, which luckily
> matches the current vCPU limit. However, if we decide to increase the
> vCPU limit, that no longer holds and this function would silently
> fetch an out-of-bounds pointer.
>
> No architecture allows more than 128 vCPUs for HVM guests, and pages
> are at most 4 KB, so this case doesn't occur with the current limits.
DYM "at least 4 KB"? If there was an arch with 2k pages but 128 vCPU limit,
it would be affected, wouldn't it?
> Make sure that out-of-bounds attempts are reported, and adjust the
> surrounding logic to at worst crash the offending domain instead.
Wouldn't we better prevent creation of such guests? And point out the need
to adjust code by a build-time check?
> --- a/xen/common/ioreq.c
> +++ b/xen/common/ioreq.c
> @@ -100,7 +100,14 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
> ASSERT((v == current) || !vcpu_runnable(v));
> ASSERT(p != NULL);
>
> - return &p->vcpu_ioreq[v->vcpu_id];
> + if ( likely(v->vcpu_id < (PAGE_SIZE / sizeof(struct ioreq))) )
> + return &p->vcpu_ioreq[v->vcpu_id];
Imo you then also need to use array_access_nospec() here.
> + else
> + {
> + gprintk(XENLOG_ERR, "Out of bounds vCPU %pv in ioreq server\n", v);
> + WARN();
> + return NULL;
> + }
> }
While I'm generally arguing against such needless uses of "else", this one
is imo a particularly bad example. The brace-enclosed scope gives the strong
(but misleading) impression that the function is lacking a trailing "return".
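Taking the two points above together, a purely illustrative sketch of the
reshaped function (error path first, bound applied via array_index_nospec(),
the helper behind array_access_nospec(); names and exact placement are
assumptions, not the actual patch):

/* Illustrative only; not the submitted change. */
static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
{
    shared_iopage_t *p = s->ioreq.va;

    ASSERT((v == current) || !vcpu_runnable(v));
    ASSERT(p != NULL);

    if ( unlikely(v->vcpu_id >= PAGE_SIZE / sizeof(struct ioreq)) )
    {
        gprintk(XENLOG_ERR, "Out of bounds vCPU %pv in ioreq server\n", v);
        WARN();
        return NULL;
    }

    /* Clamp the index against speculative out-of-bounds reads. */
    return &p->vcpu_ioreq[array_index_nospec(v->vcpu_id,
                                             PAGE_SIZE / sizeof(struct ioreq))];
}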
Jan
On 17/11/2025 at 10:29, Jan Beulich wrote:
> On 14.11.2025 17:32, Teddy Astie wrote:
>> A 4K page appears to be able to hold 128 ioreq entries, which luckily
>> matches the current vCPU limit. However, if we decide to increase the
>> vCPU limit, that no longer holds and this function would silently
>> fetch an out-of-bounds pointer.
>>
>> No architecture allows more than 128 vCPUs for HVM guests, and pages
>> are at most 4 KB, so this case doesn't occur with the current limits.
>
> DYM "at least 4 KB"? If there was an arch with 2k pages but 128 vCPU limit,
> it would be affected, wouldn't it?
>
Yes, I made a typo here.
>> Make sure that out-of-bounds attempts are reported, and adjust the
>> surrounding logic to at worst crash the offending domain instead.
>
> Wouldn't we better prevent creation of such guests? And point out the need
> to adjust code by a build-time check?
>
So overall, just:
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index f5fd30ce12..7a0421cc07 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -99,6 +99,7 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
ASSERT((v == current) || !vcpu_runnable(v));
ASSERT(p != NULL);
+ BUILD_BUG_ON(HVM_MAX_VCPUS > (PAGE_SIZE / sizeof(struct ioreq)));
return &p->vcpu_ioreq[v->vcpu_id];
}
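The BUILD_BUG_ON() makes the latent assumption explicit: it compiles to
nothing while HVM_MAX_VCPUS (currently 128) still fits in
PAGE_SIZE / sizeof(struct ioreq), and turns any future bump of the vCPU
limit into a build failure until this code is reworked.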
>> --- a/xen/common/ioreq.c
>> +++ b/xen/common/ioreq.c
>> @@ -100,7 +100,14 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
>> ASSERT((v == current) || !vcpu_runnable(v));
>> ASSERT(p != NULL);
>>
>> - return &p->vcpu_ioreq[v->vcpu_id];
>> + if ( likely(v->vcpu_id < (PAGE_SIZE / sizeof(struct ioreq))) )
>> + return &p->vcpu_ioreq[v->vcpu_id];
>
> Imo you then also need to use array_access_nospec() here.
>
>> + else
>> + {
>> + gprintk(XENLOG_ERR, "Out of bounds vCPU %pv in ioreq server\n", v);
>> + WARN();
>> + return NULL;
>> + }
>> }
>
> While I'm generally arguing against such needless uses of "else", this one
> is imo a particularly bad example. The brace-enclosed scope gives the strong
> (but misleading) impression that the function is lacking a trailing "return".
>
> Jan
>
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
On 14/11/2025 4:32 pm, Teddy Astie wrote:
> A 4K page appears to be able to hold 128 ioreq entries, which luckily
> matches the current vCPU limit. However, if we decide to increase the
> vCPU limit, that no longer holds and this function would silently
> fetch an out-of-bounds pointer.
>
> No architecture allows more than 128 vCPUs for HVM guests, and pages
> are at most 4 KB, so this case doesn't occur with the current limits.
>
> Make sure that out-of-bounds attempts are reported, and adjust the
> surrounding logic to at worst crash the offending domain instead.
>
> No functional change.
>
> Reported-by: Julian Vetter <julian.vetter@vates.tech>
> Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
> ---
> v2:
> - check and report instead of ASSERT, and possibly crash the offending domain
>
> xen/common/ioreq.c | 27 ++++++++++++++++++++++++---
> 1 file changed, 24 insertions(+), 3 deletions(-)
>
> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
> index f5fd30ce12..a2a2dafe85 100644
> --- a/xen/common/ioreq.c
> +++ b/xen/common/ioreq.c
> @@ -100,7 +100,14 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
> ASSERT((v == current) || !vcpu_runnable(v));
> ASSERT(p != NULL);
>
> - return &p->vcpu_ioreq[v->vcpu_id];
> + if ( likely(v->vcpu_id < (PAGE_SIZE / sizeof(struct ioreq))) )
> + return &p->vcpu_ioreq[v->vcpu_id];
> + else
> + {
> + gprintk(XENLOG_ERR, "Out of bounds vCPU %pv in ioreq server\n", v);
> + WARN();
> + return NULL;
> + }
> }
While I appreciate this is fixing a latent bug, it's very invasive *and*
needs removing as part of changing the 128 limit. (i.e. it's a lot of
churn for no benefit in the meantime.)
Xen cannot release in a position where the user can request 130 vCPUs but
explodes like this on the top two.
Furthermore, the WARN() isn't ratelimited, yet this is
guest-triggerable. What we need to do is borrow WARN_ON_ONCE() from
Linux for cases like this, where it's obvious in the logs but can't be
triggered repeatedly.
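Xen has no WARN_ON_ONCE() today, so this is only a sketch of what borrowing
it could look like, built on the existing WARN(); the name and exact form
are assumptions:

/*
 * Sketch only: loosely mirrors the Linux semantics, reporting the first
 * occurrence of the condition and staying quiet afterwards.
 */
#define WARN_ON_ONCE(cond) ({                          \
    static bool warned_;                               \
    bool ret_ = !!(cond);                              \
                                                       \
    if ( unlikely(ret_) && !warned_ )                  \
    {                                                  \
        warned_ = true;                                \
        WARN();                                        \
    }                                                  \
                                                       \
    ret_;                                              \
})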
~Andrew